MEAN FIELD CONVERGENCE OF A MODEL OF MULTIPLE TCP
CONNECTIONS THROUGH A BUFFER IMPLEMENTING RED 

By D. R. McDonald 1 and J. Reynier 2 

University of Ottawa and Ecole Normale Superieure 

RED (Random Early Detection) has been suggested when mul- 
tiple TCP sessions are multiplexed through a bottleneck buffer. The 
idea is to detect congestion before the buffer overflows by dropping 
or marking packets with a probability that increases with the queue 
length. The objectives are reduced packet loss, higher throughput, 
reduced delay and reduced delay variation achieved through an equi- 
table distribution of packet loss and reduced synchronization. 

Baccelli, McDonald and Reynier [Performance Evaluation 11 (2002) 
77-97] have proposed a fluid model for multiple TCP connections in 
the congestion avoidance regime multiplexed through a bottleneck 
buffer implementing RED. The window sizes of each TCP session 
evolve like independent dynamical systems coupled by the queue 
length at the buffer. The key idea in [Performance Evaluation 11 
(2002) 77-97] is to consider the histogram of window sizes as a ran- 
dom measure coupled with the queue. Here we prove the conjecture 
made in [Performance Evaluation 11 (2002) 77-97] that, as the num- 
ber of connections tends to infinity, this system converges to a deter- 
ministic mean-field limit comprising the window size density coupled 
with a deterministic queue. 

1. Introduction. Imagine the scenario where N work stations in a uni- 
versity department are connected by a switched ethernet to a departmental 
router. If every work station simultaneously FTPs a file to some distant ma- 
chine, then the output buffer in the router will be a bottleneck. We study 
the interaction of N TCP/IP connections in the congestion avoidance phase
of TCP Reno routed through a bottleneck queue. 

Upon receiving a TCP packet, the recipient sends back an acknowledgment
packet so there is one Round Trip Time (RTT) between the time a
packet is sent and the acknowledgment is received. The acknowledgment
contains the sequence number of the highest value in the byte stream suc- 
cessfully received up to this point in time. By counting the number of packets 
sent but not yet acknowledged, each source implements a window flow con- 
trol which limits the number of packets from this connection allowed into the 
network during one RTT. Duplicate acknowledgments are generated when 
packets arrive out of order or when a packet is lost. 

The link rate of the router is $NL$ packets per second. We assume packets
have equal mean sizes of 1 data unit. We assume the packets from all
connections join the queue at the bottleneck buffer and we denote by $Q^N(t)$
the average queue per flow. We assume the scheduling to be FIFO.

We imagine the source writes its current window size and the current 
RTT in each packet it sends, where by the current RTT we mean the RTT 
of the last acknowledged packet: 

• $W_n^N(t)$ is defined to be the window size written in a packet from connection
$n$ arriving at the router at time $t$.

• $R_n^N(t)$ is defined to be the RTT written in a packet from connection $n$
arriving at the router at time $t$ (this RTT is the sum of the propagation
delay plus the queueing delay in the router).

• We shall assume that connections can be divided in $d$ classes $K_c$, where
$c \in \{1, 2, \dots, d\}$, where the transmission time $T_n$ of any connection $n \in K_c$
equals the common transmission time $T_c$ for class $c$ (the notation will be
clear from context).

At time $t$, source $n$ has sent $W_n^N(t)$ packets into the network over the
last $R_n^N(t)$ seconds. The acknowledgments for these packets arrive at rate
$W_n^N(t)/R_n^N(t)$ on average. New packets are being sent at the rate
acknowledgments come back to the source so we will define the instantaneous
transmission rate of source $n$ at time $t$ to be $X_n^N(t) = W_n^N(t)/R_n^N(t)$. This
definition models the transmissions over any long period of time $T$. During
time $T$, the total number of packet-minutes of work done by the network for
connection $n$ is

\[ \int_0^T W_n^N(t)\,dt = \int_0^T X_n^N(t) R_n^N(t)\,dt. \]

Consequently, our definition of the transmission rate is consistent with a
Little type formula which calculates the work done as the integral of the
packet arrival rate $X_n^N(t)$ times the work done for each packet, $R_n^N(t)$.

Under TCP Reno, established connections execute congestion avoidance
where the window size of each connection increases by one packet each time a
packet makes a round trip, that is, each $R_n^N$, as long as no losses or timeouts
occur. During this phase, the rate the window of connection $n$ increases
is approximately $1/R_n^N(t)$ packets per second. The only thing restraining the
growth of transmission rates is a loss or timeout. When a loss occurs the
window is reduced by half.

The source detects a loss when three duplicate acknowledgments arrive. 
The source cuts the window size in half and then starts a fast retransmit/fast
recovery by immediately resending the lost packet. Fast retransmit/fast re- 
covery ends and congestion avoidance resumes when the acknowledgment of 
the retransmitted packet is received by the source. We assume the losses are 
only generated by the RED (Random Early Detection) active buffer man- 
agement scheme or by tail-drop. We neglect the possibility of transmission 
losses. 

We also neglect the possibility that some of the connections fall into time- 
out. This may occur if there is a loss when the window size is three or less. 
In this case there can't be three duplicate acknowledgments. The source 
can't recognize that a loss has occurred and essentially keeps on waiting for 
a long timeout period. Alternatively, if a retransmitted packet is lost, the 
source will fall into timeout. Two losses in the same RTT may not produce 
a timeout and may have a different effect on the window size depending 
on the version of TCP being used. The detection of the second loss may 
provoke a retransmission, but no window reduction with NewReno [11] or 
with SACK [13] or will only provoke a second window reduction after the 
acknowledgment of the retransmission of the first lost packet; that is, the 
discovery of the second loss occurs more than one round trip time after the 
second loss occurred. When the timeout period elapses, the source restarts 
quickly using slow-start and attempts to re-enter the congestion avoidance 
phase. Losses which occur simultaneously with packets arriving out of order 
are also a major cause of timeouts. In practice, a certain proportion of the 
connections will be in timeout at any given time. In effect, one has to
redefine $N$ if one wants to compare theoretical predictions with simulations (see
[2]).

We will assume the large buffer holds B packets and that, once this buffer 
space is exhausted, arriving packets are dropped. Such tail-drops come in 
addition to the RED mechanism. Here we take the drop probability of RED 
(of an incoming packet before being processed) to be a function of the queue 
size which is zero for a queue length below $Q_{\min}$ but rises linearly to $p_{\max}$
at $Q_{\max}$ and is equal to 1 above $Q_{\max}$.

Note that this is not exactly as originally specified in [12], where the drop 
probability was taken as a linear function of the exponential moving average 
of the queue size. Since we will let the number of sources tend to infinity, 
this averaging out of fluctuations is not necessary and, in fact, is deleterious 
since it adds further delays into the system. 

If all $N$ connections are in congestion avoidance, we can reformulate this
drop probability in terms of $Q^N$, the queue size divided by $N$, as $F(Q^N(t))$,
where $F$ is a distribution function which is zero below $q_{\min} = Q_{\min}/N$ but
rises linearly to $p_{\max}$ at $q_{\max} = Q_{\max}/N$ and jumps to 1 at $q_{\max}$. Of course,
the tail-drop scheme can be considered as the limiting case when $q_{\min} = 0$,
$q_{\max} = b$ ($B = Nb$) and $p_{\max} = 0$.
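
The drop function $F$ described above can be sketched in a few lines. This is only an illustration under stated assumptions, not code from [2] or [12]: the function and argument names are hypothetical, and returning 1 exactly at $q_{\max}$ is one way of encoding the jump.

```python
def red_drop_probability(q, q_min, q_max, p_max):
    """Piecewise-linear RED drop probability F(q) for the per-flow queue q.

    Assumption: the value at q == q_max is taken to be 1 (the jump).
    """
    if q < q_min:
        # Below the lower threshold nothing is dropped.
        return 0.0
    if q < q_max:
        # Linear ramp from 0 at q_min up to p_max at q_max.
        return p_max * (q - q_min) / (q_max - q_min)
    # At or above q_max every arriving packet is dropped.
    return 1.0


# Tail-drop as the limiting case q_min = 0, q_max = b, p_max = 0:
# no drops while the per-flow queue is below b, certain drop at b.
b = 1.0
tail_drop_below = red_drop_probability(0.7, 0.0, b, 0.0)
tail_drop_full = red_drop_probability(b, 0.0, b, 0.0)
```

With $p_{\max} = 0$ the ramp is identically zero, so the only losses occur when the buffer is full, which is the tail-drop behaviour described in the text.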

In [2] we used a fluid description of the queue and a continuous approx- 
imation of the loss rate of each connection to construct a model for the 
evolution of the fluid queue as a function of the histogram of window sizes. 
This model generalized the model in [15]. The goal of this paper is to prove 
the conjecture in [2] that the histogram or empirical measure $M_c^N(t,dw)$ of
the window sizes in any class $c$ ($d = 1$ in [2]) converges to a deterministic
mean field limit with measure $M_c(t,dw)$ at time $t$ and, moreover, the relative
fluid queue size $Q^N(t)$ converges to a deterministic fluid queue $Q(t)$.

Construct a sequence $f_k$ of bounded, positive continuous convergence
determining functions on $[0,\infty)$ (see pages 111 and 112 in [10]). Define a
metric for weak convergence for probability measures on $[0,\infty)$ by defining
the distance between probabilities $\mu$ and $\nu$ as

\[ \|\mu - \nu\|_w := \sum_{k=1}^{\infty} \frac{1}{2^k}\bigl(1 \wedge |\langle f_k,\mu\rangle - \langle f_k,\nu\rangle|\bigr), \]

where $\langle f,\mu\rangle = \int_0^\infty f(w)\mu(dw)$. Also, let $\|\cdot\|_s$ denote the Skorohod distance
between two elements of $D[0,T]$.

Theorem 1. Under Assumptions 1 and 2 given in Section 2, the random
measure of the window sizes of connections in each class $c$ converges
in probability to a deterministic measure $M_c(t,dw)$; that is,
$\|M_c^N(t,dw) - M_c(t,dw)\|_w \to 0$ in probability as $N \to \infty$. $M_c(t,dw)$ is the marginal
distribution of $M_c(s - R_c(s),dv; s,dw)$, the deterministic limit of the joint
distribution of the window sizes at time $t$ and at time $t - R_c(t)$.

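Let $G = \{g \in C_b^1(\mathbb{R}^+) : g(0) = 0\}$, where $C_b^1(\mathbb{R}^+)$ is the space of bounded
functions with bounded derivatives. For $g \in G$, $c = 1, \dots, d$,

\begin{align*}
&\langle g, M_c(t)\rangle - \langle g, M_c(0)\rangle \\
&\quad= \int_0^t \Bigl\langle \frac{dg}{dw}(w), M_c(s,dw)\Bigr\rangle \frac{1}{R_c(s)}\,ds \\
&\qquad+ \int_0^t \bigl\langle (g(w/2) - g(w))v, M_c(s - R_c(s),dv; s,dw)\bigr\rangle
  \frac{1}{R_c(s - R_c(s))} K(s - R_c(s))\,ds \tag{1.1} \\
&\quad= \int_0^t \Bigl\langle \frac{dg}{dw}(w), M_c(s,dw)\Bigr\rangle \frac{1}{R_c(s)}\,ds \\
&\qquad+ \int_0^t \bigl\langle (g(w/2) - g(w)), e(s, s - R_c(s), w) M_c(s,dw)\bigr\rangle
  \times \frac{1}{R_c(s - R_c(s))} K(s - R_c(s))\,ds, \tag{1.2}
\end{align*}

where $\langle g, M_c(t)\rangle = \int_{w=0}^{\infty} g(w) M_c(t,dw)$, where $R_c(t) = T_c + Q(t - R_c(t))/L$,
where $K(t) = F(Q(t))$ for $Q(t) < q_{\max}$ and where

\[ e(s, s - R_c(s), w) = \int_v v\,\frac{M_c(s - R_c(s),dv; s,dw)}{M_c(s,dw)} \]

is the conditional expectation of the window one RTT in the past, given the
window is now $w$.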

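Moreover, the queue size converges in probability to a deterministic limit
$Q(t)$ satisfying

\begin{align*}
\frac{dQ(t)}{dt} &= \sum_{c=1}^d \kappa_c \langle w, M_c(t,dw)\rangle \frac{(1 - K(t))}{R_c(t)} - L \tag{1.3} \\
&\quad+ \Bigl(\sum_{c=1}^d \kappa_c \langle w, M_c(t,dw)\rangle \frac{(1 - K(t))}{R_c(t)} - L\Bigr)^{-} \chi\{Q(t) = 0\}.
\end{align*}

When $Q(t) = q_{\max}$, $K(t)$ is determined by

\[ K(t) = \max\Bigl\{p_{\max},\ 1 - L\Bigl(\sum_{c=1}^d \kappa_c \frac{\langle w, M_c(t,dw)\rangle}{R_c(t)}\Bigr)^{-1}\Bigr\}, \]

where $\langle w, M_c(t,dw)\rangle = \int_w w\,M_c(t,dw)$ is Lipschitz continuous in $t$ for each
$c$.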

The numerical evaluation of the above equations is considered in Section 7. 

Note that there may be a discontinuity in $K(t)$ when $Q(t)$ hits $q_{\max}$.
The problem at $q_{\max}$ arises because $F$ is not continuous and certainly not
Lipschitz at this point. To justify this definition, we consider the Gentle
RED variant [18], where $b > 2q_{\max}$, and we extend the definition of $F$ to rise
linearly from $p_{\max}$ to 1 between $q_{\max}$ and $2q_{\max}$ so $F$ is Lipschitz. In Section 5
we modify Gentle RED so that the drop probability increases linearly from
$p_{\max}$ at $q_{\max}$ to 1 at $q_{\max} + \delta < b$. The weak limit of this modified Gentle
RED as $\delta \to 0$ gives the discontinuous $K(t)$ above. The reflection problem
at queue size zero can be solved by the Skorohod construction.
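
To see why the modified Gentle RED profile is Lipschitz for every fixed $\delta > 0$ but not in the limit, a small sketch may help. The function names are hypothetical and the code only restates the piecewise-linear description above; it is not taken from [18] or from Section 5.

```python
def modified_gentle_red(q, q_min, q_max, p_max, delta):
    """Drop probability that ramps from p_max at q_max to 1 at q_max + delta."""
    if q < q_min:
        return 0.0
    if q < q_max:
        # Usual RED ramp from 0 at q_min to p_max at q_max.
        return p_max * (q - q_min) / (q_max - q_min)
    if q < q_max + delta:
        # Extra ramp replacing the jump of plain RED.
        return p_max + (1.0 - p_max) * (q - q_max) / delta
    return 1.0


def lipschitz_constant(q_min, q_max, p_max, delta):
    """Largest slope of the piecewise-linear profile."""
    return max(p_max / (q_max - q_min), (1.0 - p_max) / delta)


# The profile is continuous at q_max, but its steepest slope grows like
# 1/delta, so the Lipschitz bound is lost as delta tends to 0.
slopes = [lipschitz_constant(1.0, 3.0, 0.1, d) for d in (0.5, 0.05, 0.005)]
```

The growing slopes illustrate why the limit $\delta \to 0$ has to be handled separately from the Lipschitz case.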

It is important to emphasize that we have not justified that the fluid 
models in [15] or [2] are the limit of some discrete packet level model. We are 
proving far less; that is, that the intuitively attractive fluid models in [15] or 
[2] do converge to a mean field limit. This convergence is implicitly assumed 
in the engineering literature and the resulting limit processes are used to 
analyze the stability of various Active Queue Management (AQM) control 
strategies (RED among others) [15]. We also make the simplifying modeling
assumptions made in [2], although more precise alternatives are suggested
(see the Doppler factor, $[1 - \frac{d}{dt}R_n^N(t)]$, in (2.1)). These simplifications (or
the assumption that mean windows one RTT apart are uncorrelated made
in [15] but not by [2]) may be too gross and, to date, nobody has done
the network measurements to check which assumptions make a significant
difference. This is a major failing because the control theoretic analysis,
as proposed, for instance, in [15] or [8], may be highly sensitive to these
assumptions.

The conclusion, as far as RED is concerned, is negative. If the RED param- 
eters are not chosen properly (as a function of RTT), then RED is unstable. 
The fixed point described in [2] may be unstable and a tiny oscillation is 
amplified until the queue size oscillates wildly. Even with a large buffer, the 
utilization may drop to less than one. Although one may criticize lacunae 
in the model, this conclusion is verified by simulation and for this reason, 
RED is rarely activated even though it is implemented on most routers. 

We are mainly interested in a mathematical proof of the convergence to 
the mean field so we will ignore timeouts and slow-start, as well as spe- 
cial details of congestion avoidance which would only serve to obscure the 
main ideas. We will, nevertheless, sketch how these extensions could be han- 
dled. Our method could be adapted to proving mean field limits for control 
schemes other than RED. [19] and [7] analyze time slotted rate-based and 
queue-based models with delay where the number of sources tends to infinity.
For their queue-based model, they prove the convergence of the queue to a 
deterministic limit and propagation of chaos; that is, the transmission rates 
of each source converge to a system driven by the deterministic queue size. 
In their model the limiting, deterministic queue is not coupled with limiting 
distribution of the window sizes and there is no associated partial differential 
equation as in [2]. The only comparable analysis is the recent thesis [3]. 

The structure of the paper is as follows. In Section 2 we model the system 
of N windows coupled with the queue and then formulate this as a histogram 
of window sizes coupled with a queue. In Section 2.3 we summarize the mean 
field limit. The proof of the existence of this limit follows in Section 3. In 
Section 4 we establish the convergence to a unique limit. Finally, in Section 6 
we establish Theorem 1. 

The TCP model we are studying can be viewed as $N$ dynamical systems
[the $N$ window sizes $W^N := (W_1^N, \dots, W_N^N)$] which evolve independently
except through a shared resource (the queue $Q^N$). The dynamics of the
shared resource depend only on the distribution of the dynamical systems
(in this case on the average window size). The standard approach is to prove
existence and then uniqueness of the limit. That's what we do here, but the
mathematical innovation is to first create a modified system [here $(W^N, Q^N)$
is modified to $(\tilde{W}^N, \tilde{Q}^N)$] where the dynamics of the shared resource (the
modified queue $\tilde{Q}^N$) depend on the expected value of the distribution (or,
in this case, the expected value of the average window sizes of the modified
system). Essentially, we just stick an expectation in front of the interaction
term [here we replace the average window size $\overline{W}^N(t)$ by $E\,\overline{\tilde{W}}{}^N(t)$].

Since the modified shared resource $\tilde{Q}^N$ is deterministic, the modified
dynamical systems are independent. Moreover, it is easy to pick a convergent
subsequence for the shared resource (here for $\tilde{Q}^N \to \tilde{Q}$). It is then easy to
prove $\tilde{W}_n^N$ converges to a limit $\tilde{W}_n$ along the subsequence for each
component $n$. This gives the existence of an infinite modified system [here $(\tilde{W}, \tilde{Q})$,
where $\tilde{W} = (\tilde{W}_1, \tilde{W}_2, \dots)$]. Next, the key remark is that by the law of large
numbers (and boundedness),

\[ \overline{\tilde{W}}(t) := \lim_{N \to \infty} \frac{1}{N}\sum_{n=1}^{N} \tilde{W}_n(t) = E\,\overline{\tilde{W}}(t); \]

that is, the infinite modified system is, in fact, a limit of the original system!
Here this means we can rewrite $(\tilde{W}, \tilde{Q})$ as $(W, Q)$ with $W := (W_1, W_2, \dots)$
since the interaction term is $\overline{W}(t)$; that is, the interaction is through the
window average and not the expected value of the window average.

Next, we use a coupling argument to show each original dynamical system
(here each $W_n^N$) converges almost surely in Skorohod norm to the infinite
limit (here $W_n^N$ converges to $W_n$). This proves the propagation of chaos
where each dynamical system ($W_n$) is independent and interacts with the
other systems only through the deterministic shared resource $Q$. From this,
we can show the mean field convergence (here the convergence of $Q^N$ to $Q$
and the histogram of the $W_n^N$ to the mean field limit).

We have used Kurtz's approach (e.g., [9]) of bringing back the particles; 
that is, not projecting onto the histogram of window sizes as is standard 
(e.g., [5]). We believe this is the most effective way to handle feedback delay 
because the histogram of window sizes is not a state. We think our approach 
has potential in other contexts. 



2. The N-particle system and mean-field limit.



2.1. The N-particle Markov process. Our model takes into account the
delay of one round trip time between the time the packet is killed and 
the time when the buffer receives the reduced rate. We assume window 
reductions at connection n occur because of a loss one round trip time in 
the past. To first order, the probability of a window reduction between time 
t and t + h is 



\begin{align*}
&\int_{t - R_n^N(t)}^{t + h - R_n^N(t+h)} \frac{W_n^N(s)}{R_n^N(s)} K^N(s)\,ds \tag{2.1} \\
&\quad\approx \Bigl[1 - \frac{d}{dt}R_n^N(t)\Bigr] \frac{W_n^N(t - R_n^N(t))}{R_n^N(t - R_n^N(t))} K^N(t - R_n^N(t))h,
\end{align*}

since the probability a packet is dropped is proportional to
$W_n^N(t - R_n^N(t))/R_n^N(t - R_n^N(t))$, the transmission rate one RTT in the past, times
$K^N(t - R_n^N(t))$, the drop probability one RTT in the past. The Doppler term
$[1 - \frac{d}{dt}R_n^N(t)]$ is a small correction that was overlooked in [2] and we will ignore
it here.

There are many ways of actually implementing packet drops once the
drop probability $p = K^N(t)$ is determined at time $t$. We could drop packets
deterministically one every $1/p$ packets, but this may introduce unwanted
synchronization. In fact, [12] proposes two methods. In the first we simply
generate a Bernoulli random variable with probability $p$ of dropping a
packet. In the second the dropped packet is chosen uniformly among the
next $1/p$ packets. In fact, it won't matter which method is employed since,
as $N \to \infty$, the contribution of each flow becomes negligible. Consequently,
the packet arrivals of connection $n$ are enormously spread out among the
other packets. As far as connection $n$ is concerned, packets are dropped
randomly with probability $p = K^N(t)$ at time $t$. We therefore model the process
of window reductions by a Poisson point process with stochastic intensity

\[ \lambda_n^N(t) := \frac{W_n^N(t - R_n^N(t))}{R_n^N(t - R_n^N(t))} K^N(t - R_n^N(t)) \]

[we can assume $W_n^N(t) = w_n$ for $t < 0$].

Of course, the second method proposed in [12] would induce a weak de- 
pendence between the Poisson processes for different connections. However, 
the interaction between flows is via the average window size and the minor 
weak dependence won't prevent the average from converging to a determin- 
istic limit. We will assume the first method is used, but it would be possible 
to alter the argument in Sections 3 and 6 to account for weak dependence. 

We can construct the simple point process of window reductions:

\[ N_n^N(t) = \int_0^t \int_0^\infty \chi_{[0,\lambda_n^N(u)]}(v)\,d\Gamma_n^N(u,v), \]

where the $\Gamma_n^N(u,v)$ are two-dimensional Poisson processes with intensities
1 on $[0,T] \times [0,\infty)$. In addition, the sources evolve independently given the
trajectory of $K^N$. We therefore assume the $\Gamma_n^N(u,v)$ are independent. In
order to derive strong convergence theorems as $N \to \infty$, we shall suppose,
without loss of generality, $\Gamma_n^N(u,v) = \Gamma_n(u,v)$, where $\{\Gamma_n\}_{n=1,\dots,\infty}$ is a
sequence of i.i.d. two-dimensional Poisson processes with intensity 1 defined
on a probability space $(\Omega, \mathcal{F}, P)$. In fact, Poisson processes in
$\{(t,u) : 0 \le u \le \Lambda(t), 0 \le t \le T\}$ would do, where $\Lambda$ is defined after Assumption 1 as an a priori
bound on the transmission rate. Define $\mathcal{F}_t = \sigma\{\Gamma_n(u,v); u \le t, n = 1, 2, \dots\}$.
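
The construction above is a thinning of a unit-rate planar Poisson process: a point $(u,v)$ produces a window reduction at time $u$ when $v$ lies below the current intensity. The following is a minimal sketch of that idea only. The function names are hypothetical, and for simplicity the intensity is passed in as a given function of time bounded by `lam_bound`, whereas in the model it depends on the past of the window and the queue.

```python
import random


def poisson_points(horizon, lam_bound, rng):
    """Unit-intensity Poisson points on [0, horizon] x [0, lam_bound], by time."""
    points = []
    u = 0.0
    while True:
        # Projected onto the time axis the points form a Poisson process of
        # rate lam_bound, so the gaps are exponential with that rate.
        u += rng.expovariate(lam_bound)
        if u > horizon:
            return points
        # Given the time coordinate, the mark is uniform on [0, lam_bound].
        points.append((u, rng.uniform(0.0, lam_bound)))


def thinned_jump_times(points, intensity):
    """Keep the times u of the points (u, v) with v <= intensity(u)."""
    return [u for (u, v) in points if v <= intensity(u)]


rng = random.Random(1)
points = poisson_points(1000.0, 2.0, rng)
jumps = thinned_jump_times(points, lambda u: 1.0)
```

Because every connection can be driven by its own copy of the planar process, all $N$-particle systems can be built on one probability space, which is what the strong convergence statements rely on.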

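The above is a different version of the point process of window reductions
than that in [2]. The laws are the same, so the resulting dynamical systems
have the same distribution. Consequently, the convergence in probability
proved in Theorem 1 is also valid for the version used in [2].

Differential equation for queue size. For $Q^N(t) < q_{\max}$, $K^N(t) = F(Q^N(t))$
and the rate of change of the fluid buffer is given by

\begin{align*}
\frac{d(NQ^N(t))}{dt} &= \sum_{n=1}^N \frac{W_n^N(t)}{R_n^N(t)}(1 - K^N(t)) - NL \\
&\quad+ \Bigl(\sum_{n=1}^N \frac{W_n^N(t)}{R_n^N(t)}(1 - K^N(t)) - NL\Bigr)^{-} \chi\{Q^N(t) = 0\}
\end{align*}

since the proportion $K^N(t) := F(Q^N(t))$ of the total fluid,

\[ \sum_{n=1}^N \frac{W_n^N(t)}{R_n^N(t)}, \]

is lost. The second term prevents the queue size from becoming negative.
In effect, the queue can stick at 0 until a sufficient number of connections
increase their window size.

Dividing by $N$ gives

\begin{align*}
\frac{dQ^N(t)}{dt} &= \frac{1}{N}\sum_{n=1}^N \frac{W_n^N(t)}{R_n^N(t)}(1 - K^N(t)) - L \tag{2.2} \\
&\quad+ \Bigl(\frac{1}{N}\sum_{n=1}^N \frac{W_n^N(t)}{R_n^N(t)}(1 - K^N(t)) - L\Bigr)^{-} \chi\{Q^N(t) = 0\},
\end{align*}

with $Q^N(0) = q(0)$.

If $Q^N(t)$ reaches $q_{\max}$ and

\[ (1 - p_{\max})\frac{1}{N}\sum_{n=1}^N \frac{W_n^N(t)}{R_n^N(t)} > L, \]

then the queue must jitter at $q_{\max}$ and the loss probability $K^N(t)$ is
determined by

\[ (1 - K^N(t))\frac{1}{N}\sum_{n=1}^N \frac{W_n^N(t)}{R_n^N(t)} = L. \]

In other words, if $Q^N(t) = q_{\max}$, then

\[ K^N(t) = \max\Bigl\{p_{\max},\ 1 - L\Bigl(\frac{1}{N}\sum_{n=1}^N \frac{W_n^N(t)}{R_n^N(t)}\Bigr)^{-1}\Bigr\}. \]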

To justify the above definition of $K^N(t)$, one would really need to show the
loss probability of a packet model jittering at $q_{\max}$ converges weakly to $K^N(t)$.
Instead, we show the loss probability of Gentle RED converges to $K^N(t)$ as
Gentle RED converges to RED (see Section 5). We should also contrast this
loss rate with that for small buffers (i.e., $B$ is constant as $N \to \infty$) studied by
[17]. For a small buffer, fluctuations will cause packet losses long before the
total transmission rate reaches the link rate $NL$. Essentially one can model
$K^N$ as $L_b\bigl(\frac{1}{N}\sum_{n=1}^N \frac{W_n^N(t)}{R_n^N(t)}\bigr)$, where $L_b$ can be calculated by finding the
equilibrium distribution of a suitable Markov chain as in [16]. However, since
our buffer is scaled with $N$, fluctuations like this can be ignored. It is worth
noting that our method would allow us to prove mean field convergence for
the small buffer case. There would be no equation for the queue and the
round trip times are constant.
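
The per-flow queue dynamics and the jitter rule at $q_{\max}$ can be sketched as two small functions. This is an illustration under assumptions, not the authors' implementation: the names are hypothetical, `rate` stands for the per-flow offered rate $\frac{1}{N}\sum_n W_n^N(t)/R_n^N(t)$, and the reflection term is read as a negative part, $x^- = \max(-x, 0)$.

```python
def loss_probability(q, rate, link_rate, q_max, p_max, drop_function):
    """Loss probability K: F(q) below q_max, the jitter value at q_max."""
    if q < q_max:
        return drop_function(q)
    if rate <= 0.0:
        # Nothing is offered, so the ramp value p_max is the larger term.
        return p_max
    # At q_max the loss is at least p_max and otherwise removes exactly the
    # excess over the link rate: (1 - K) * rate = link_rate.
    return max(p_max, 1.0 - link_rate / rate)


def queue_drift(q, rate, loss, link_rate):
    """Right-hand side of the per-flow queue equation."""
    drift = rate * (1.0 - loss) - link_rate
    if q <= 0.0:
        # Reflection at zero: cancel a negative drift so the queue stays at 0.
        drift += max(-drift, 0.0)
    return drift


# Overloaded link with the queue at q_max: the jitter loss makes the drift 0.
k_full = loss_probability(1.0, 2.0, 1.0, 1.0, 0.1, lambda q: 0.1 * q)
drift_full = queue_drift(1.0, 2.0, k_full, 1.0)
```

An Euler step `q += h * queue_drift(...)` would give a crude numerical scheme; the discontinuity of the loss at $q_{\max}$ is exactly what makes such a scheme delicate.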

Differential equation for windows. There are three separate phases: con- 
gestion avoidance, timeout and slow start. We concentrate on describing the 
congestion avoidance phase. During congestion avoidance, while the queue 
is nonempty but less than the buffer size, the window size increases by one 
every time a complete window is acknowledged. In [2] and [15] this term was 
taken to be simply 1/R n (t) (i.e., one packet increase per RTT) and we make 
the same approximation here. Note that this approximation ignores the fact 
that acknowledgments return to the sources at the link rate NL when the 
queue is nonempty. 

If the source detects the loss of a packet at time $t - R_n^N(t)$ because three
duplicate acknowledgments arrive, the source cuts the current window size
$W_n^N(t^-)$ by half to $W_n^N(t^-)/2$. The slow start threshold (ssthresh) is set
to $H_n^N(t) = W_n^N(t^-)/2$. The source then begins fast retransmit and fast
recovery. The lost packet is retransmitted and, through window inflation,
packets continue to be sent as if the window size is constant [or at least
the average transmission rate is consistent with a constant window size
$W_n^N(t^-)/2$]. We will ignore this effect and assume the window size increases
at a rate of $1/R_n(t)$ even during fast recovery; that is, we don't include the
term $(1 - \chi_{S_n}(t))$ in [2]. When the retransmitted packet is acknowledged,
congestion avoidance resumes. Hence, the evolution of the window size in
the congestion avoidance phase is described by the following stochastic
differential equation:

\[ dW_n^N(t) = \frac{dt}{R_n^N(t)} - \frac{W_n^N(t^-)}{2}\,dN_n^N(t), \tag{2.3} \]

with $W_n^N(0) = w_n$, $n = 1, \dots, N$, specified. Denote the vector of window sizes
by $W^N(t)$.
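
Between loss events the window equation is solved explicitly: linear growth at rate one packet per RTT, then a halving at each jump. The sketch below makes this concrete for a single connection. It is a simplification under stated assumptions: the function name is hypothetical, the RTT is frozen at a constant `rtt`, and the jump times are supplied from outside rather than generated by the delayed stochastic intensity.

```python
def window_path(w0, rtt, jump_times, t_end):
    """Window at time t_end for dW = dt/rtt - (W/2) dN with constant rtt."""
    w, t = w0, 0.0
    for s in sorted(jump_times):
        if s > t_end:
            break
        # Additive increase since the previous event.
        w += (s - t) / rtt
        # Multiplicative decrease: the window just before the jump is halved.
        w /= 2.0
        t = s
    # Additive increase from the last jump up to t_end.
    return w + (t_end - t) / rtt


# Start at 4 packets with a 0.5 s RTT and a single loss detected at t = 1.
w_final = window_path(4.0, 0.5, [1.0], 2.0)
```

In this example the window grows from 4 to 6, is halved to 3 at the loss, and then grows by 2 more packets over the remaining second.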

If we wished to model timeouts, we could define a function $U(W_n^N(t^-))$
equal to one if the connection falls into timeout during fast recovery and
zero if not. Hence, the point process of falling into timeout is given by
$U(W_n^N(t^-))\,dN_n^N(t)$. During the timeout phase, the window size is zero.
The connection is described by ssthresh and the remaining time in timeout.
After the timeout phase elapses, the source enters slow-start and doubles 
its window size starting from one every RTT until the window size reaches 
ssthresh, at which time congestion avoidance restarts. If another loss is de- 
tected before reaching the congestion avoidance phase, the connection will 
go into timeout. During the slow-start phase, the connection is described by 
the window size and ssthresh. At any time t, a certain proportion of the N 
connections will be in each phase. In the mean-field limit these proportions 
will converge to deterministic fractions. We will not show this here. In fact, 
we will simply ignore all the special details of fast recovery and timeouts. 



Assumptions on the initial state. 



Assumption 1. (i) Prior to time zero, the window size of connection $n$
in the $N$ connection system is a constant $w_n$, where $0 \le w_n \le W_{\max}$.

(ii) The transmission time $T_n$ of connection $n$ satisfies $T_{\min} \le T_n \le T_{\max}$
for all $n$.

(iii) $Q^N(0) = q(0)$, a constant.

Bound $a(t)$ for the window size at time $t$. From Assumption 1(i),

\[ a(t) := W_{\max} + \frac{t}{T_{\min}} \ge w_n + \frac{t}{T_{\min}} \ge W_n^N(t) \]

at every time $t$. The stochastic intensity for the Poisson point process of
losses of connection $n$ is $\lambda_n^N(t) \le a(t)/T_{\min} =: \Lambda(t)$ for all $0 \le t \le T$.

Note that $(1 - K^N(t)) \ge 1 - p_{\max}$ as long as $Q^N(t) < q_{\max}$. If $Q^N(t) =
q_{\max}$, then $(1 - K^N(t)) \ge LT_{\min}/a(t)$. Either way we have $(1 - K^N(t)) \ge
(1 - U) > 0$.

Relation between RTT and queue size. Define $\phi_n^N(s)$ to be the future
round trip time written into a packet leaving the source at time $s$. For
the above scenario, $\phi_n^N(s) = T_n + Q^N(s)/L$. Also note that $s + \phi_n^N(s)$ is
monotonic because the derivative, if $Q^N(s) < q_{\max}$, is

\[ 1 + \frac{1}{L}\frac{dQ^N(s)}{ds} \ge \frac{1}{L}\Bigl(\frac{1}{N}\sum_{n=1}^N \frac{W_n^N(s)}{R_n^N(s)}(1 - K^N(s))\Bigr). \]

This is positive unless all the window sizes are zero and this has probability
zero. If $Q^N(s) = q_{\max}$, then the derivative is one.

Now define the RTT of connection $n$ as marked in packets arriving at
the router at time $t$ by $R_n^N(t) = t - s = \phi_n^N(s)$ if $s + \phi_n^N(s) = t$. Since
$s + \phi_n^N(s)$ is monotonic, $R_n^N(t)$ is well defined and $\phi_n^N(t - R_n^N(t)) = R_n^N(t)$. Also,
substituting $R_n^N(t) = t - s$ into $\phi_n^N(s) = T_n + Q^N(s)/L$, we get that $R_n^N$
satisfies

\[ R_n^N(t) = T_n + Q^N(t - R_n^N(t))/L. \tag{2.4} \]

Moreover, by taking the derivative of (2.4), we get

\[ (1 - \dot{R}_n^N(t)) = \frac{1}{1 + \dot{Q}^N(t - R_n^N(t))/L}. \tag{2.5} \]
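
Since $s + \phi_n^N(s)$ is monotonic, the implicit relation (2.4) can be solved numerically by bisection on the send time $s = t - R$. The sketch below is only an illustration: the function name and signature are hypothetical, `queue` is assumed to be a known function of time defined wherever it is evaluated (including before time 0), and `q_max` is used only to bracket the root.

```python
def round_trip_time(t, transmission_time, link_rate, queue, q_max, tol=1e-12):
    """Solve R = T + queue(t - R)/L by bisection on the send time s = t - R."""

    def arrival(s):
        # Time at which a packet sent at s is seen again, s + phi(s).
        return s + transmission_time + queue(s) / link_rate

    # Since 0 <= queue <= q_max, the send time lies in this interval.
    lo = t - transmission_time - q_max / link_rate
    hi = t - transmission_time
    for _ in range(200):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if arrival(mid) < t:
            lo = mid
        else:
            hi = mid
    return t - 0.5 * (lo + hi)


# Constant per-flow queue of 0.4 with T = 0.1 and L = 2: the queueing delay
# 0.4 / 2 is added to the transmission time.
rtt_const = round_trip_time(5.0, 0.1, 2.0, lambda s: 0.4, 1.0)
```

Bisection is chosen here because it needs only the monotonicity established above; a fixed-point iteration on (2.4) would additionally need a contraction condition on the queue.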



2.2. Reformulation in terms of a measure-valued process. 



Classes of connections. We will assume there are $d$ classes of connections
$K_c$, $c = 1, \dots, d$, and all connections in class $c$ have the same
transmission time $T_c$. Hence, $R_n^N = R_c^N$ for all $n \in K_c$. We will also assume the
proportion of the $N$ connections in class $c$ is $\kappa_c^N$. In addition to Assumption
1, we assume the following:



Assumption 2. Assumptions on connection classes:

(iv) The proportion of users in the class $c$: $\kappa_c^N \to \kappa_c$ for $c = 1, \dots, d$ as
$N \to \infty$.

(v) Let $\mu_c^N$ be the histogram of windows of connections from class $c$ at
time 0. We suppose that, for all $c$, $\mu_c^N$ converges weakly to $\mu_c$ as $N \to \infty$,
where the support of $\mu_c$ is concentrated on $[0, W_{\max}]$.



Measure-valued process. In order to study the limiting behavior of the
system as the number of connections $N$ goes to infinity, we will first define
the empirical process (see [5, 6]) of those connections in class $K_c$. For any
Borel set $A$, define

\[ M_c^N(t, A) := \frac{1}{\kappa_c^N N}\sum_{n=1}^N \chi_A(W_n^N(t))\,\chi\{n \in K_c\} \tag{2.7} \]

to be the associated probability-measure-valued process taking values in
$M_1(\mathbb{R}^+)$, the set of probability measures on $\mathbb{R}^+ = [0,\infty)$ furnished with the
topology of weak convergence.
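
For a finite system the empirical measure of a class is just the normalized histogram of its windows, so integrating a test function $g$ against it is a class average. A minimal sketch follows. The function name and the list-based representation are hypothetical, and it assumes $\kappa_c^N N$ is exactly the number of connections in class $c$.

```python
def class_pairing(g, windows, class_of, c):
    """Integral of g against the empirical measure of class c."""
    members = [w for w, k in zip(windows, class_of) if k == c]
    if not members:
        raise ValueError("class %r has no connections" % (c,))
    # Dividing by the class size plays the role of 1 / (kappa_c^N N).
    return sum(g(w) for w in members) / len(members)


windows = [2.0, 4.0, 6.0, 8.0]
class_of = [1, 2, 1, 2]

# g = identity gives the average window of the class.
mean_window_1 = class_pairing(lambda w: w, windows, class_of, 1)
# g = indicator of A = [0, 5] gives the mass the measure puts on A.
mass_below_5 = class_pairing(lambda w: 1.0 if w <= 5.0 else 0.0, windows, class_of, 1)
```

Taking $g$ to be the identity recovers the class average window, which is the only functional of the measure that enters the queue equation.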
How the future is determined. The sequence $\{\Gamma_n(u,v)\}_{n \in \mathbb{N}}$ of independent
Poisson processes with intensity 1 was defined on a probability space
$\{\Omega, \mathcal{F}, P\}$. $W^N(t) = (W_1^N(t), W_2^N(t), \dots)$, $Q^N(t)$ and $M^N(t) = (M_1^N(t), \dots,
M_d^N(t))$ can be constructed path by path as processes defined on $\{\Omega, \mathcal{F}, P\}$
taking values in $(\mathbb{R}^+)^\infty$, $\mathbb{R}^+$ and $M_1(\mathbb{R}^+)^d$, where the coordinates of $W^N(t)$
above $N$ are zero. It suffices to assume $W_n^N(t) = w_n$ for $t < 0$ and build the
solution of the system (2.3), (2.2) pathwise from jump point to jump point
of $\Gamma_n$.

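Reformulating (2.3). Let

\[ \langle g,\mu\rangle = \int g(w)\,\mu(dw) \]

and

\[ \langle \mathrm{Id},\mu\rangle = \int w\,\mu(dw), \]

so

\[ \overline{W}_c^N(s) := \langle \mathrm{Id}, M_c^N(s)\rangle = \frac{1}{\kappa_c^N N}\sum_{n=1}^N W_n^N(s)\,\chi\{n \in K_c\}. \]

If $g \in G$, then

\begin{align*}
&\langle g, M_c^N(t)\rangle - \langle g, M_c^N(0)\rangle \\
&\quad= \int_0^t \Bigl\langle \frac{dg}{dw}(w), M_c^N(s,dw)\Bigr\rangle \frac{ds}{R_c^N(s)} \\
&\qquad+ \frac{1}{\kappa_c^N N}\sum_{n=1}^N \chi\{n \in K_c\}\int_0^t \bigl(g(W_n^N(s^-)/2) - g(W_n^N(s^-))\bigr)\,dN_n^N(s).
\end{align*}

In Section 6 we consider the limit of the above as $N$ goes to infinity to obtain
an equation for the evolution of the distribution of the windows.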

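Reformulating (2.2). For $Q^N(t) < q_{\max}$, $K^N(t) = F(Q^N(t))$ and

\begin{align*}
Q^N(t) - Q^N(0) &= \int_0^t \Bigl[\sum_{c=1}^d \kappa_c^N \langle \mathrm{Id}, M_c^N(s)\rangle \frac{(1 - K^N(s))}{R_c^N(s)} - L\Bigr]\,ds \tag{2.9} \\
&\quad+ \int_0^t \Bigl(\sum_{c=1}^d \kappa_c^N \langle \mathrm{Id}, M_c^N(s)\rangle \frac{(1 - K^N(s))}{R_c^N(s)} - L\Bigr)^{-} \chi\{Q^N(s) = 0\}\,ds,
\end{align*}

where

\[ R_c^N(t) = T_c + Q^N(t - R_c^N(t))/L. \tag{2.10} \]

If $Q^N$ jitters at $q_{\max}$, then the loss probability $K^N(t)$ is given by

\[ K^N(t) = \max\Bigl\{p_{\max},\ 1 - L\Bigl(\sum_{c=1}^d \kappa_c^N \frac{\langle \mathrm{Id}, M_c^N(t)\rangle}{R_c^N(t)}\Bigr)^{-1}\Bigr\}. \]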

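2.3. Summary of the mean-field limit. We wish to show that $(W^N(t), Q^N(t))$,
the unique solution to the $N$-particle system, converges as $N \to \infty$. In
Section 3 we first prove the existence of the following limit.

Theorem 2. If Assumptions 1 and 2 hold, then there exists a unique
strong solution $(W, Q, (M_1, \dots, M_d))$ to the following system. For $Q(t) <
q_{\max}$, $K(t) = F(Q(t))$ and

\begin{align*}
Q(t) - Q(0) &= \int_0^t \Bigl[\sum_{c=1}^d \kappa_c \frac{\langle \mathrm{Id}, M_c(s)\rangle}{R_c(s)}(1 - K(s)) - L\Bigr]\,ds \tag{2.11} \\
&\quad+ \int_0^t \Bigl(\sum_{c=1}^d \kappa_c \frac{\langle \mathrm{Id}, M_c(s)\rangle}{R_c(s)}(1 - K(s)) - L\Bigr)^{-} \chi\{Q(s) = 0\}\,ds.
\end{align*}

When $Q(t) = q_{\max}$, then $K(t)$ satisfies

\[ K(t) = \max\Bigl\{p_{\max},\ 1 - L\Bigl(\sum_{c=1}^d \kappa_c \frac{\langle \mathrm{Id}, M_c(t)\rangle}{R_c(t)}\Bigr)^{-1}\Bigr\}. \]

Each window evolves according to

\[ dW_n(t) = \frac{1}{R_n(t)}\,dt - \frac{W_n(t^-)}{2}\,dN_n(t), \tag{2.12} \]

where $W_n(0) = w_n$, $n = 1, \dots$, are specified, where

\[ N_n(t) = \int_0^t \int_0^\infty \chi_{[0,\lambda_n(u)]}(v)\,d\Gamma_n(u,v), \]

where

\[ \lambda_n(s) = \frac{W_n(s - R_n(s))}{R_n(s - R_n(s))} K(s - R_n(s)) \]

and where $M_c(t)$ is defined by

\[ \langle g, M_c(t)\rangle = \lim_{N \to \infty} \frac{1}{\kappa_c^N N}\sum_{n=1}^N g(W_n(t))\,\chi\{n \in K_c\} \]

and, as a consequence, we can define

\[ \overline{W}_c(s) = \langle \mathrm{Id}, M_c(s)\rangle = \lim_{N \to \infty} \frac{1}{\kappa_c^N N}\sum_{n=1}^N W_n(s)\,\chi\{n \in K_c\}. \]

Consequently, from (2.11),

\begin{align*}
Q(t) - Q(0) &= \int_0^t \Bigl[\sum_{c=1}^d \kappa_c \frac{\overline{W}_c(s)}{R_c(s)}(1 - K(s)) - L \tag{2.13} \\
&\qquad+ \Bigl(\sum_{c=1}^d \kappa_c \frac{\overline{W}_c(s)}{R_c(s)}(1 - K(s)) - L\Bigr)^{-} \chi\{Q(s) = 0\}\Bigr]\,ds.
\end{align*}

The solutions $Q(t)$ and $R(t) = (R_1(t), R_2(t), \dots)$ are deterministic, as are
the $M_c(t)$. Finally, the components of $W$ are independent processes.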

Define

\[ S^N(s) = \frac{1}{N}\sum_{n=1}^N \frac{W_n^N(s)}{R_n^N(s)} = \sum_{c=1}^d \kappa_c^N \frac{\overline{W}_c^N(s)}{R_c^N(s)}, \tag{2.14} \]

where $\overline{W}_c^N(s)$ is the average window size of connections in $K_c$ among the
first $W_1^N, \dots, W_N^N$. Define $S_N(s)$ analogously from $W$. From Theorem 2, we
can define

\[ S(s) = \lim_{N \to \infty} S_N(s) = \lim_{N \to \infty} \frac{1}{N}\sum_{n=1}^N \frac{W_n(s)}{R_n(s)}. \tag{2.15} \]

In Section 4 we will prove Theorem 3 and show that there is only one
strong solution to (2.13), (2.12) and that, in fact, the solution to (2.3) and
(2.2) converges to this strong solution.

Theorem 3. If Assumptions 1 and 2 hold, then $\|M_c^N(t) - M_c(t)\|_w$,
$\|Q^N(t) - Q(t)\|_s$ and $\|K^N(t) - K(t)\|_s$ converge to zero in probability, where
$M_c(t)$, $Q(t)$ and $K(t)$ are deterministic functions of $t \in \mathbb{R}^+$ into $M_1(\mathbb{R}^+)$,
$\mathbb{R}^+$ and $\mathbb{R}^+$, respectively, given in Theorem 2.

Let $P^N$ be the measure induced on $D^d[0,T] \times C[0,T]$ by
$((\overline{W}_1^N, \dots, \overline{W}_d^N), Q^N)$.

LEMMA 1. Under Assumptions 1 and 2, the measures $P^N$ are tight.

Proof. We check the conditions for Theorem 12.3 in [4]. Condition (i)
is immediate since the sequences $((\overline{W}_1^N, \dots, \overline{W}_d^N), Q^N(t))$ are bounded.
In condition (ii) we are given positive constants $\epsilon$ and $\eta$. We can pick $\Delta$
sufficiently small that the maximum growth of a window over a duration
of length $\Delta$ is less than $\epsilon/2$; that is, pick $\Delta < T_{\min}\epsilon/2$. Also pick $\Delta$
sufficiently small that the probability of the event $B$ that a Poisson process with
intensity $\Lambda(T)$ jumps twice within a duration $\Delta$ is less than $\eta\epsilon a(T)/2$.

Note that, by the construction of the window $W_n^N$, the event that the
window is cut in half two times within a duration $\Delta$ up to time $T$ is contained
in $B_n$, the event where $\Gamma_n(t, \Lambda(T))$ has two jumps in an interval of duration
$\Delta$. Also, note the worst oscillation a window can make over a duration $\Delta$ is
$a(T)$; that is, the biggest drop possible. Hence, the modulus $w'_c(\Delta)$ of any
trajectory of $\overline{W}_c^N$, $c = 1, \dots, d$, as defined at (12.6) in [4] satisfies



Since we can make the above estimates simultaneously for each of the
$d$ classes, we see condition (ii) holds for the oscillations of $(\overline{W}_1^N, \dots, \overline{W}_d^N)$.
The oscillations of $Q^N(t)$ and of $(R_1^N(t), \dots, R_d^N(t))$ are uniformly bounded
in $C[0,T]$ because each trajectory is uniformly Lipschitz. This shows the $P^N$
are tight. □

Using Lemma 1, we can extract a subsequence such that Q N and 
{W l , . . . , W d ) converge almost surely to Q°° and (W^ , . . . , W^°) in Skoro- 
hod norm. The convergence of the components W£ follows. Unfortunately, 
we wouldn't even know that the limits Q°° and (W™ , . . . , Wj) are deter- 
ministic. 

Using the above method and Jakubowski's criterion (cf. [6], Theorem 3.6.4), 
we might be able to check that the measure valued processes M^(t), . . . , M d (t) 
in D([0,T],Mi(R + )) are tight. We might even check that the bivariate mea- 
sure valued processes Mf(t — R N (t),dv;t, dw), . . . , M d \t — R N (t),dv; t, dw) 
in D([0,T], M\(M. + ) 2 ) are tight. We could then pick a convergent subse- 
quence and carry out the analysis in Section 6. We would obtain a limiting 
solution to Theorem 1. Unfortunately, we won't know the solution is unique 
(and deterministic) because (1.1) and (1.3) don't determine the solution 
since (Mi(t), . . . , M d (t)) and Q(t) is not a state (because of the delay in the 
system) . 



<(A)<-$>(r) XBn + £ /2). 



71=1 



Hence 




<-^rP(B)<r,. 
ea(l ) 
It may be possible to rectify this by defining the state at time $t$ to be the entire trajectory of the measures $M_1(t), \ldots, M_d(t)$ and $Q(t)$ back at least one RTT. Indeed, the numerical procedure proposed in Section 7 shows how to maintain all this information to numerically solve (1.1) and (1.3). However, instead of trying to characterize tightness of such an ugly space, we will proceed in a more direct manner in the next section.

3. Existence of a limit. In this section we show the existence of the 
solution to (2.13), (2.12). 

3.1. Modified system. We now introduce the modified system discussed in the introduction where $Q^N$ is forced to be deterministic by modifying the equation for the evolution of the queue to (3.1). Then we extract a deterministic limit that turns out to be a limit of our initial system.

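Let $\mathcal{K}^N(t) = F(Q^N(t))$ if $Q^N(t) < q_{\max}$, where $Q^N$ is defined by

(3.1)
$$Q^N(t) - Q(0) = \int_0^t \Big[\sum_{c=1}^{d} \kappa_c^N \frac{E\langle \mathrm{Id}, \mathcal{M}_c^N(s)\rangle}{\mathcal{R}_c^N(s)}(1 - \mathcal{K}^N(s)) - L + \Big(\sum_{c=1}^{d} \kappa_c^N \frac{E\langle \mathrm{Id}, \mathcal{M}_c^N(s)\rangle}{\mathcal{R}_c^N(s)}(1 - \mathcal{K}^N(s)) - L\Big)^{-}\chi\{Q^N(s) = 0\}\Big]\,ds,$$

where $\mathcal{W}^N = (\mathcal{W}_1^N, \ldots, \mathcal{W}_N^N)$ satisfies the analogue of (2.3), where $\mathcal{W}_n^N(0) = w_n$, where

(3.2)
$$\mathcal{R}_c^N(t) = T_c + Q^N(t - \mathcal{R}_c^N(t))/L,$$

and where

$$\langle \mathrm{Id}, \mathcal{M}_c^N(s)\rangle = \frac{1}{\kappa_c^N N}\sum_{n=1}^{N} \mathcal{W}_n^N(s)\,\chi\{n \in K_c\}.$$
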

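If $Q^N(t) = q_{\max}$, then the loss probability $\mathcal{K}^N(t)$ satisfies

(3.3)
$$\mathcal{K}^N(t) = \max\Big\{p_{\max},\; 1 - L\Big(\sum_{c=1}^{d} \kappa_c^N \frac{E\langle \mathrm{Id}, \mathcal{M}_c^N(t)\rangle}{\mathcal{R}_c^N(t)}\Big)^{-1}\Big\}.$$

We remark that when $Q^N$ hits $q_{\max}$, the loss probability $\mathcal{K}^N(t)$ jumps from $p_{\max}$ to a value which keeps $Q^N(t) = q_{\max}$ as long as

$$\sum_{c=1}^{d} \kappa_c^N \frac{E\langle \mathrm{Id}, \mathcal{M}_c^N(t)\rangle}{\mathcal{R}_c^N(t)}(1 - p_{\max}) > L.$$

We will show below that $E\langle \mathrm{Id}, \mathcal{M}_c^N(t)\rangle$ is Lipschitz continuous so, in fact, $\mathcal{K}^N(t) = p_{\max}$ just before $Q^N(t)$ leaves the boundary.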
Solution for a given N. Let $t_c(0) = 0$, $t_c(k+1) = t_c(k) + T_c + Q^N(t_c(k))/L$ such that $t_c(k+1) - \mathcal{R}_c^N(t_c(k+1)) = t_c(k)$. As long as we can define it, the sequence $(t_c(k))$ is increasing in $k$. Define $\Phi_c(t) =$ the first $k$ such that $t_c(k) > t$. We will construct our solution by recurrence from time $t_i$ to $t_{i+1}$ by defining $t_{i+1} = \min_c t_c(\Phi_c(t_i))$ starting from time $t_0 = 0$.

At time $t_0$, we suppose $\mathcal{W}^N(t)$ and $Q^N(t)$ are given (perhaps constant) for $t \le 0 = t_0$. We suppose $(\mathcal{W}^N(t), Q^N(t))$ is defined for $t \le t_i$, where $t_i$ is a time such that $t_i = t_c(k)$ for some $c$ and some $k$. This is certainly true at time $t_0$. Then $\Phi_c(t_i)$ and $t_c(\Phi_c(t_i))$ are defined for all classes as is $t_{i+1}$.

Then if $t < t_{i+1}$,

$$\lambda_n^N(t) = \frac{\mathcal{W}_n^N(t - \mathcal{R}_c^N(t))}{\mathcal{R}_c^N(t - \mathcal{R}_c^N(t))}\,\mathcal{K}^N(t - \mathcal{R}_c^N(t)), \qquad n \in K_c,$$

can be defined, because, for each class, $s - \mathcal{R}_c^N(s) < t_i$ for $s < t_{i+1}$ by the definition of $t_{i+1}$ [recall (2.6)]. Hence, the point processes $\mathcal{N}_n^N(t)$ and the trajectories $\mathcal{W}^N(t)$ are defined on $[0, t_{i+1}]$. They are bounded and measurable, thus, the expectations can be defined and, hence, $Q^N(t)$ can be defined. We have therefore checked the induction hypothesis up to time $t_{i+1}$.

To conclude, we need to show that $t_i \to \infty$. Notice that $t_{i+d+1} \ge \min\{t_c(\Phi_c(t_i) + 1) : c = 1, \ldots, d\}$ because otherwise the $d+1$ values $t_{i+j}$, $j = 1, \ldots, d+1$, must be chosen among the $d$ values $\{t_c(\Phi_c(t_i))\}$ and this is impossible. We conclude $t_{i+d+1} \ge t_i + T_{\min}$ and, therefore, $t_i \to \infty$ as $i \to \infty$.
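
The recurrence for the construction times can be sketched numerically. The example below is only an illustration under a simplifying assumption: the queue is supplied as a known function `Q` of time, whereas in the construction above $Q^N$ is built step by step along the grid. The names `class_times` and `merged_grid` are hypothetical.

```python
def class_times(T_c, Q, L, horizon):
    """Times t_c(k) with t_c(k+1) = t_c(k) + T_c + Q(t_c(k))/L.

    The list ends with the first time strictly beyond `horizon`.
    """
    times = [0.0]
    while times[-1] <= horizon:
        t = times[-1]
        times.append(t + T_c + Q(t) / L)
    return times

def merged_grid(T, Q, L, horizon):
    """Grid t_0 = 0, t_{i+1} = min_c t_c(Phi_c(t_i)), up to `horizon`."""
    per_class = [class_times(T_c, Q, L, horizon) for T_c in T]
    grid = [0.0]
    while grid[-1] < horizon:
        t_i = grid[-1]
        # t_c(Phi_c(t_i)) is the first class-c time strictly after t_i.
        grid.append(min(next(t for t in times if t > t_i) for times in per_class))
    return grid
```

With an empty queue, `merged_grid([1.0, 1.5], lambda t: 0.0, 1.0, 3.0)` interleaves the multiples of the two propagation delays, and any $d + 1 = 3$ consecutive steps advance the grid by at least $T_{\min} = 1$.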

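Lipschitz continuity of $E\langle \mathrm{Id}, \mathcal{M}_c^N(t)\rangle$ and $Q^N$ on $[0,T]$.

$$|E\langle \mathrm{Id}, \mathcal{M}_c^N(t+h)\rangle - E\langle \mathrm{Id}, \mathcal{M}_c^N(t)\rangle| = \Big|\frac{1}{\kappa_c^N N}\sum_{n=1}^{N} E[\mathcal{W}_n^N(t+h) - \mathcal{W}_n^N(t)]\,\chi\{n \in K_c\}\Big|$$

(3.4)
$$\le \frac{h}{T_{\min}}\,\frac{1}{\kappa_c^N N}\sum_{n=1}^{N} P\big(\mathcal{N}_n^N(s) = \mathcal{N}_n^N(t),\ t < s \le t+h\big)\,\chi\{n \in K_c\}$$

(3.5)
$$+\; a(T)\,\frac{1}{\kappa_c^N N}\sum_{n=1}^{N} P\big(\mathcal{N}_n^N(s) \ne \mathcal{N}_n^N(t) \text{ for some } t < s \le t+h\big)\,\chi\{n \in K_c\}.$$
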

The second term arises because even multiple jumps will create a difference less than the maximum window size.

(3.4) is less than $h/T_{\min}$ and tends to zero as $h \to 0$. (3.5) is bounded by the probability a window makes a jump in an interval of length $h$. The intensity function $\lambda_n^N(t)$ is bounded by $\Lambda(t) = a(t)/T_{\min}$, so the probability of a jump in an interval of length $h$ is bounded by $1 - \exp(-a(T)h/T_{\min})$.
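
The passage from this jump probability to a Lipschitz bound can be made explicit. The short derivation below is a sketch that uses only the elementary inequality $1 - e^{-x} \le x$ for $x \ge 0$; it is added for clarity and is not displayed in the original argument.

```latex
P\big(\text{a jump in } (t, t+h]\big)
  \;\le\; 1 - \exp\!\big(-a(T)\,h/T_{\min}\big)
  \;\le\; \frac{a(T)}{T_{\min}}\,h .
```

Both (3.4) and (3.5) are therefore at most a constant multiple of $h$, with a constant that depends only on $a(T)$ and $T_{\min}$ and not on $N$.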
Hence, $E\langle \mathrm{Id}, \mathcal{M}_c^N(t)\rangle$ is Lipschitz uniformly for $N \in \mathbb{N}$ and $t \in [0,T]$. From (3.1), it immediately follows that $Q^N$ is also Lipschitz.
Let $m_c^N(t) = E\langle \mathrm{Id}, \mathcal{M}_c^N(t)\rangle$.

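Lower bound on $m_c^N(t)$.

Lemma 2. The derivatives of $m_c^N(t)$ and $Q^N(t)$ are bounded below uniformly in $N$.

Proof. Taking expectations,

$$E\mathcal{W}_n^N(t) - w_n = E\int_0^t \Big[\frac{1}{\mathcal{R}_n^N(s)}\,ds - \frac{\mathcal{W}_n^N(s^-)}{2}\,d\mathcal{N}_n^N(s)\Big]$$

$$= \int_0^t \frac{1}{\mathcal{R}_n^N(s)}\,ds - E\int_0^t \frac{\mathcal{W}_n^N(s^-)}{2}\,\frac{\mathcal{W}_n^N(s - \mathcal{R}_n^N(s))}{\mathcal{R}_n^N(s - \mathcal{R}_n^N(s))}\,\mathcal{K}^N(s - \mathcal{R}_n^N(s))\,ds.$$

Hence,

$$\frac{dE\mathcal{W}_n^N(t)}{dt} = \frac{1}{\mathcal{R}_n^N(t)} - E[\mathcal{W}_n^N(t^-)\,\mathcal{W}_n^N(t - \mathcal{R}_n^N(t))]\,\frac{\mathcal{K}^N(t - \mathcal{R}_n^N(t))}{2\,\mathcal{R}_n^N(t - \mathcal{R}_n^N(t))}$$

$$\ge \frac{1}{T_{\max} + q_{\max}/L} - \frac{E\mathcal{W}_n^N(t^-)}{2}\,\frac{a(t)}{T_{\min}}.$$

Since $E\mathcal{W}_n^N(t) \le a(t)$, it follows that $dE\mathcal{W}_n^N(t)/dt$ is strictly bounded below by $-C$, where $C$ is a positive constant that doesn't depend on $N$ or $n$. Taking the average of the $\kappa_c^N N$ windows with RTT $T_c$ shows the derivatives of the $m_c^N(t)$ are bounded below uniformly in $N$.

Since $\mathcal{R}_c^N(t) \le T_{\max} + q_{\max}/L$, it follows that the derivative of $\frac{1}{N}\sum_{n=1}^{N} E\mathcal{W}_n^N(t)/\mathcal{R}_n^N(t)$ is bounded below by the same constant. The fact that the derivative of $Q^N(t)$ is bounded below uniformly in $N$ follows from (3.1). □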

Lemma 3. The sequence of functions $\mathcal{K}^N(t)$ on $[0,T]$ is sequentially compact.

Proof. As long as $Q^N(t) < q_{\max}$, $\mathcal{K}^N(t) = F(Q^N(t))$, so $\mathcal{K}^N(t)$ is Lipschitz uniformly in $N$. Similarly, as long as $Q^N(t) = q_{\max}$, $\mathcal{K}^N(t) = 1 - L\big(\sum_{c=1}^{d} \kappa_c^N m_c^N(t)/\mathcal{R}_c^N(t)\big)^{-1}$ and this is Lipschitz uniformly in $N$. The problem is the jumps when $Q^N(t)$ hits the boundary.

We check the conditions in Theorem 12.3 in [4]. Condition (12.25) is trivial since $\mathcal{K}^N(t)$ is uniformly bounded. To check (12.26), we must show the oscillations over small intervals after excluding big jumps are as small as we like. For any $\varepsilon > 0$, we can define the set of times $J^N = \{j_i\}$ associated with jumps bigger than $\varepsilon/2$. The number of points in this set is bounded uniformly in $N$ and the spacing between the points of $J^N$ is bounded below uniformly in $N$ by some $\delta_0$. This follows immediately from Lemma 2 because, after a jump of $\mathcal{K}^N(t)$ of size greater than $\varepsilon$ when $Q^N(t)$ hits $q_{\max}$, the time to decrease to $p_{\max}$ is strictly bounded below. Select a $\delta < \delta_0$ such that $Q^N(t)$ and $1 - L\big(\sum_{c=1}^{d} \kappa_c^N m_c^N(t)/\mathcal{R}_c^N(t)\big)^{-1}$ oscillate less than $\varepsilon/4$ on intervals of length less than $\delta$. Next, consider a partition $\Delta^N$ by points $0 = t_0 < t_1 < \cdots < t_v = T$ which are $\delta$-sparse; that is, such that $\min_{1 \le i \le v}(t_i - t_{i-1}) > \delta$, which includes the points in $J^N$. This is possible because these points are spaced out by more than $\delta_0$.

Now consider the maximum oscillations over any interval $[t_{i-1}, t_i)$. There are no jumps of size greater than $\varepsilon/2$ since $\mathcal{K}^N$ is right continuous and the big jumps are among the left endpoints $t_{i-1}$. Since the jumps only go up from $p_{\max}$, they don't add, so, in fact, the greatest possible oscillation is $\varepsilon/2$ for one jump plus $\varepsilon/4 + \varepsilon/4$ for the oscillations of $F(Q^N(t))$ and $1 - L\big(\sum_{c=1}^{d} \kappa_c^N m_c^N(t)/\mathcal{R}_c^N(t)\big)^{-1}$. We conclude $w'_{\mathcal{K}^N}(\delta)$, as defined in [4], is less than $\varepsilon$ and this establishes condition (12.26). □

3.2. Existence of a limit for the modified system. In this section we shall 
be extracting subsequences of sequences, but we won't reflect this in our 
notation until the end of this subsection. 

Extraction of a limit for $Q^N$ and $E\langle \mathrm{Id}, \mathcal{M}_c^N(t)\rangle$. $Q^N(t)$ is deterministic. Moreover, the integrand in (3.1) is bounded by a constant $B$ because the window sizes up to time $T$ are bounded by $a(T)$ and the RTT is greater than $T_{\min} > 0$. Hence, $Q^N$ is Lipschitz uniformly for $N \in \mathbb{N}$ and $t \in [0,T]$. It follows that there is a subsequence and a Lipschitz function $Q(t)$ such that $Q^N(t) \to Q(t)$ uniformly, using the Ascoli-Arzela theorem plus the fact that a uniform limit of Lipschitz functions is Lipschitz.

We showed above that $E\langle \mathrm{Id}, \mathcal{M}_c^N(t)\rangle$ is Lipschitz so, again using the Ascoli-Arzela theorem, we can take a further subsequence of $N$ such that, for all $c$, $m_c^N(t) = E\langle \mathrm{Id}, \mathcal{M}_c^N(t)\rangle$ converges uniformly to a Lipschitz function $m_c(t)$.

Note that taking the limit as $N \to \infty$ in Lemma 2 gives that the derivatives of $m_c(t)$ and $Q(t)$ are bounded below.

Convergence of the RTT. As a direct consequence of the convergence of $Q^N$, $\mathcal{R}_n^N(t)$ converges uniformly to $\mathcal{R}_n(t)$, where $\mathcal{R}_n(t) = T_n + Q(t - \mathcal{R}_n(t))/L$.
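
The RTT is defined only implicitly by $\mathcal{R}_n(t) = T_n + Q(t - \mathcal{R}_n(t))/L$. A minimal sketch of how such an equation can be solved numerically is given below; the function name `solve_rtt`, the explicit bound `q_max` on the queue and the bisection tolerance are assumptions for illustration. Writing $u = t - \mathcal{R}_n(t)$, the equation becomes $u + T_n + Q(u)/L = t$, whose left-hand side is increasing in $u$ when the slope of $Q$ exceeds $-L$ (the condition established for the limit in Lemma 4).

```python
def solve_rtt(t, T_n, Q, L, q_max, tol=1e-12):
    """Return R solving R = T_n + Q(t - R)/L, by bisection on u = t - R.

    Assumes 0 <= Q <= q_max and that u -> u + T_n + Q(u)/L is increasing.
    Q must be defined for negative times (the given history).
    """
    def g(u):
        return u + T_n + Q(u) / L

    lo = t - T_n - q_max / L   # g(lo) <= t because Q <= q_max
    hi = t - T_n               # g(hi) >= t because Q >= 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < t:
            lo = mid
        else:
            hi = mid
    return t - 0.5 * (lo + hi)
```

For a constant queue `Q = q` the solver returns the closed form `T_n + q/L`, which gives a quick sanity check.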
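Extraction of the limit for $\mathcal{K}^N(t)$. By Lemma 3, we can extract a further subsequence so that $\mathcal{K}^N$ converges in the Skorohod topology to a limit $\mathcal{K}$ in $D[0,T]$. If $Q(t) < q_{\max}$, then $Q^N(t) < q_{\max}$ for $N$ large enough. Since $\mathcal{K}^N(t) = F(Q^N(t))$, it follows that $\mathcal{K}(t) = F(Q(t))$.

If $Q(t) = q_{\max}$, we want to show $\mathcal{K}(t)$ is given by

(3.6)
$$\max\Big\{p_{\max},\; 1 - L\Big(\sum_{c=1}^{d} \kappa_c \frac{m_c(t)}{\mathcal{R}_c(t)}\Big)^{-1}\Big\}.$$

First suppose $t$ is not a point where $\mathcal{K}$ jumps. Then there exists a small interval $I = (t - \delta_0, t]$, where $Q(s) = q_{\max}$ for $s \in I$. There are two possibilities:

$$\sum_{c=1}^{d} \kappa_c \frac{m_c(t)}{\mathcal{R}_c(t)}(1 - p_{\max}) > L \quad\text{or}\quad \sum_{c=1}^{d} \kappa_c \frac{m_c(t)}{\mathcal{R}_c(t)}(1 - p_{\max}) = L.$$

In the first case $\sum_{c=1}^{d} \kappa_c \frac{m_c(s)}{\mathcal{R}_c(s)}(1 - p_{\max}) > L$ for $s \in [t - \delta, t]$ for some $\delta$, $0 < \delta < \delta_0$, sufficiently small. Since $Q^N(t)$ converges to $Q(t)$, we can pick $N$ sufficiently large that $Q^N(t)$ is in a narrow tube around $Q(t)$. Moreover,

$$\sum_{c=1}^{d} \kappa_c^N \frac{m_c^N(t)}{\mathcal{R}_c^N(t)}(1 - F(Q^N(t))) \to \sum_{c=1}^{d} \kappa_c \frac{m_c(t)}{\mathcal{R}_c(t)}(1 - F(Q(t))).$$

The right-hand side is strictly greater than $L$ on $[t - \delta, t]$ for $\delta$ sufficiently small and the same is true of the left-hand side for sufficiently large $N$. This shows that, for $N$ sufficiently large, $Q^N$ will hit the boundary at a time inside $[t - \delta, t)$. Hence, for $N$ sufficiently large,

$$\mathcal{K}(t) = \lim_{N\to\infty} \mathcal{K}^N(t) = \lim_{N\to\infty}\Big[1 - L\Big(\sum_{c=1}^{d} \kappa_c^N \frac{m_c^N(t)}{\mathcal{R}_c^N(t)}\Big)^{-1}\Big] = 1 - L\Big(\sum_{c=1}^{d} \kappa_c \frac{m_c(t)}{\mathcal{R}_c(t)}\Big)^{-1}.$$

In the second case, $\mathcal{K}(t) = p_{\max} = F(q_{\max})$, so

$$\mathcal{K}(t) = \max\Big\{p_{\max},\; 1 - L\Big(\sum_{c=1}^{d} \kappa_c \frac{m_c(t)}{\mathcal{R}_c(t)}\Big)^{-1}\Big\}.$$

We have therefore established $\mathcal{K}(t) = F(Q(t))$ if $Q(t) < q_{\max}$ and

(3.7)
$$\mathcal{K}(t) = \max\Big\{p_{\max},\; 1 - L\Big(\sum_{c=1}^{d} \kappa_c \frac{m_c(t)}{\mathcal{R}_c(t)}\Big)^{-1}\Big\}$$

if $Q(t) = q_{\max}$ at all times $t$ where $\mathcal{K}(t)$ doesn't jump. Since the derivatives of $m_c(t)$ and $Q(t)$ are bounded below, there is a time interval to the right of any jump time free of jumps. Since $\mathcal{K}(t)$ is right continuous, it follows that

$$\mathcal{K}(t) = 1 - L\Big(\sum_{c=1}^{d} \kappa_c \frac{m_c(t)}{\mathcal{R}_c(t)}\Big)^{-1}$$

at jump times.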

Uniform convergence of $\mathcal{W}_n^N$. We fix some coordinate $n$. Clearly, $\mathcal{W}_n^N(t) \le a(t)$ for all $0 \le t \le T$, so it follows that $\lambda_n^N(t)$ is uniformly bounded by $\Lambda(t) = a(t)/T_{\min}$ for all $0 \le t \le T$ and all $N \ge n$. Consequently, $\mathcal{N}_n^N(t) \le N_n(t)$, where $N_n(t) = \int_0^t\!\int_0^\infty 1_{[0,\Lambda(s)]}(u)\,\Gamma_n(du,ds)$. Hence, if for some trajectory $\omega$ of $\Gamma_n$, $N_n(T) \le m$, then $\mathcal{N}_n^N(T) \le m$ for all $n$ and $N$.

For any trajectory $\omega$ of $\Gamma_n$, we can solve the system

(3.8)
$$\mathcal{W}_n(t) - w_n = \int_0^t \Big[\frac{1}{\mathcal{R}_n(s)}\,ds - \frac{\mathcal{W}_n(s^-)}{2}\,d\mathcal{N}_n(s)\Big],$$

where $\mathcal{W}_n(t) = w_n$ for $t \le 0$, for $n = 1, \ldots$, and

$$\mathcal{N}_n(t) = \int_0^t\!\!\int_0^\infty 1_{[0,\lambda_n(s))}(u)\,\Gamma_n(du,ds)$$

and

$$\lambda_n(s) = \frac{\mathcal{W}_n(s - \mathcal{R}_n(s))}{\mathcal{R}_n(s - \mathcal{R}_n(s))}\,\mathcal{K}(s - \mathcal{R}_n(s)).$$

Now consider the solution of (3.8) from jump point to jump point of $\mathcal{N}_n$. Let $T_m$, $m = 1, 2, \ldots$, be the jumps of $\mathcal{N}_n$ in $[0,T]$ corresponding to jump points $(X_m, Y_m)$, $m = 1, 2, \ldots$, of $\Gamma_n$ which satisfy $T_m = X_m$ and $Y_m \le \lambda_n(X_m)$. The solution of (3.8) for $t > T_m$ is the deterministic additive increase of the window size until time $T_{m+1}$ and this has zero chance of hitting $(X_{m+1}, Y_{m+1})$, which is chosen according to a Poisson process. Hence, with probability one, the trace $(t, \lambda_n(t))$ for $0 \le t \le T$ will avoid the points of $\Gamma_n(du,ds)$ for a given $\omega$, so we can put a band of width $\varepsilon$ around the trace $(t, \lambda_n(t))$. Now as long as $\lambda_n^N(t)$ lies in this band, the point processes $\mathcal{N}_n$ and $\mathcal{N}_n^N$ are identical. This will be the case if $(\mathcal{R}_n^N(t))^{-1}$ is sufficiently close to $(\mathcal{R}_n(t))^{-1}$ because the solutions $\mathcal{W}_n^N(t)$ and $\mathcal{W}_n(t)$ starting from the same point $w_n$ will rise together inside the band between jumps and at jumps will be cut in half together. We conclude that, with probability one, $\mathcal{W}_n^N(t)$ converges uniformly to $\mathcal{W}_n(t)$ as $N \to \infty$.
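
The band argument can be illustrated with a toy computation: if no point of the driving Poisson process lies between the graphs of two intensity functions, thinning with either intensity selects exactly the same jump times. The sketch below uses a hand-picked list of points and constant intensities, all of which are assumptions chosen for the example; the helper names are hypothetical.

```python
def accepted_points(points, intensity):
    """Points (x, y) of the driving process lying under the graph of `intensity`."""
    return [(x, y) for (x, y) in points if y < intensity(x)]

def same_jumps(points, lam, lam_perturbed):
    """True when both intensities select the same points, hence the same jump times."""
    return accepted_points(points, lam) == accepted_points(points, lam_perturbed)

# Hand-picked points (time, mark); the gap around the level 0.5 plays the role of the band.
points = [(0.5, 0.2), (1.0, 0.9), (1.5, 0.4)]
base = lambda t: 0.5
```

Here `same_jumps(points, base, lambda t: 0.55)` holds because no point has a mark between 0.5 and 0.55, whereas a perturbation up to 0.95 picks up the extra point at time 1.0.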
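Equation for the limit $\mathcal{W}_n$. Since $\mathcal{R}_n^N(s)$, $\mathcal{K}^N(s - \mathcal{R}_n^N(s))$ and $\mathcal{W}_n^N(s)$ converge uniformly to $\mathcal{R}_n(s)$, $\mathcal{K}(s - \mathcal{R}_n(s))$ and $\mathcal{W}_n(s)$, respectively, it follows that $\lambda_n^N(s)$ converges uniformly to

$$\lambda_n(s) = \frac{\mathcal{W}_n(s - \mathcal{R}_n(s))}{\mathcal{R}_n(s - \mathcal{R}_n(s))}\,\mathcal{K}(s - \mathcal{R}_n(s))$$

on $[0,T]$. It therefore follows that both sides of

$$\mathcal{W}_n^N(t) - w_n = \int_0^t \frac{1}{\mathcal{R}_n^N(s)}\,ds - \int_0^t\!\!\int_0^\infty \frac{\mathcal{W}_n^N(s^-)}{2}\,1_{[0,\lambda_n^N(s))}(u)\,\Gamma_n(du,ds)$$

converge uniformly on $D[0,T]$ to (3.8).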
3.3. Equations for the limit $(\mathcal{W}, Q)$.

Determining $\overline{W}_c$. Since $Q(t)$ is deterministic, the equations in (3.8) are independent. This, in turn, means that

$$E\mathcal{W}_c(s) = \overline{W}_c(s) = \lim_{N\to\infty}\frac{1}{\kappa_c^N N}\sum_{n=1}^{N} \mathcal{W}_n(s)\,\chi\{n \in K_c\}.$$

Note that this limit is not taken along a subsequence of $N$.

This essentially follows from the law of large numbers. It suffices to consider the $\mathcal{W}_n(s) = f(\mathcal{W}_n(0), \Gamma_n)$ defined by (3.8). $f$ is defined on $E = [0,\infty) \times D_{\mathbb{R}_+}[0,T]$, where $\mathcal{W}_n(0) \in \mathbb{R}_+ = [0,\infty)$ and $\Gamma_n^{\omega} \in D_{\mathbb{R}_+}[0,T]$, the space of cadlag functions. $E$ is a metric space with metric $m = e \otimes d$, where $e$ is the Euclidean metric on $[0,\infty)$ and $d$ is the Skorohod metric on $D_{\mathbb{R}_+}[0,T]$.

We have shown above that, for any initial point $w_0$, the set of trajectories $\Gamma_n^{\omega_0}$ ($\Gamma_n^{\omega_0}$ is the point process $\Gamma_n$ evaluated at the sample point $\omega_0$) such that the associated graph $(t, \lambda_n(t))$ does not hit any of the jump points of $\Gamma_n^{\omega_0}$ has probability one. $f$ is continuous on this set by the arguments used above. If the graph $(t, \lambda_n(t))$ avoids the points of $\Gamma_n^{\omega_0}$, then according to [4] or [10], for a point $q = (w_1, \Gamma_n^{\omega_1}) \in E$ close to $p = (w_0, \Gamma_n^{\omega_0})$, we can find a strictly increasing mapping $\theta$ of $[0,T]$ onto itself with $\sup_{[0,T]} |\theta(t) - t| \le \gamma(\theta)$, where $\gamma(\theta) \to 0$ as $q \to p$, such that $\Gamma_n^{\omega_1}(\theta(t))$ is uniformly close to $\Gamma_n^{\omega_0}(t)$. Notice that this means the jumps occur at the same times.

The solutions to (3.8) for $\mathcal{W}_n$ and $\lambda_n$ for $p$ and for $(w_1, \Gamma_n^{\omega_1}(\theta(t)))$ are therefore uniformly close. Hence, at any fixed time $s$ where $\Gamma_n^{\omega_0}$ has no jumps, we will have $\mathcal{W}_n^{\omega_1}(s)$ and $\lambda_n^{\omega_1}(s)$ uniformly close to $\mathcal{W}_n^{\omega_0}(\theta(s))$ and $\lambda_n^{\omega_0}(\theta(s))$. Since $\theta(s)$ is arbitrarily close to $s$ and since the chance of a jump arbitrarily near time $s$ tends to zero, it follows that, for $q$ sufficiently close to $p$, $\mathcal{W}_n^{\omega_1}(s)$ is arbitrarily close to $\mathcal{W}_n^{\omega_0}(s)$. This means $f$ is continuous at $(w_0, \Gamma_n^{\omega_0})$ as long as $\Gamma_n^{\omega_0}$ has no jumps at $s$ and this has probability one.
By hypothesis,

$$\frac{1}{\kappa_c^N N}\sum_{n=1}^{N} \delta_{\mathcal{W}_n(0)}\,\chi\{n \in K_c\} \to \mu_c$$

and the $\Gamma_n$ are i.i.d. Poisson processes, so the empirical measure of the pairs $(\mathcal{W}_n(0), \Gamma_n)$ converges; that is,

$$\frac{1}{\kappa_c^N N}\sum_{n=1}^{N} \delta_{(\mathcal{W}_n(0), \Gamma_n)}\,\chi\{n \in K_c\} \to \mu_c \otimes \nu,$$

where $\nu$ is the distribution of $\Gamma_n$ on $D_{\mathbb{R}_+}[0,T]$. The result now follows since $f$ is bounded and the set of discontinuities has probability zero relative to the limiting measure.

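Determining $M_c$. In addition, the above means that $\mathcal{M}_c^N(t)$ converges weakly to $M_c(t)$ almost surely $P$. For any continuous function $g$ with compact support, define the limiting measure $M_c(t)$ by

$$\langle g, M_c(t)\rangle = \lim_{N\to\infty}\frac{1}{\kappa_c^N N}\sum_{n=1}^{N} g(\mathcal{W}_n(t))\,\chi\{n \in K_c\}.$$

The deterministic limit exists almost surely by the argument above. This also means that $m_c(t) = \langle \mathrm{Id}, M_c(t)\rangle$, so when $Q(t) = q_{\max}$, $\mathcal{K}(t)$ satisfies

$$\mathcal{K}(t) = \max\Big\{p_{\max},\; 1 - L\Big(\sum_{c=1}^{d} \kappa_c \frac{\langle \mathrm{Id}, M_c(t)\rangle}{\mathcal{R}_c(t)}\Big)^{-1}\Big\}.$$

Moreover,

$$\overline{W}_c(t) := \lim_{N\to\infty}\frac{1}{\kappa_c^N N}\sum_{n=1}^{N} \mathcal{W}_n(t)\,\chi\{n \in K_c\} = \langle \mathrm{Id}, M_c(t)\rangle.$$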

Existence of a strong solution. Hence, along the subsequence $N$, we obtain an almost sure limit point $(\mathcal{W}, Q, (M_1, \ldots, M_d))$ which satisfies the modified system: (3.8) and

(3.9)
$$Q(t) - Q(0) = \int_0^t \Big[\sum_{c=1}^{d} \kappa_c \frac{\overline{W}_c(s)}{\mathcal{R}_c(s)}(1 - \mathcal{K}(s)) - L + \Big(\sum_{c=1}^{d} \kappa_c \frac{\overline{W}_c(s)}{\mathcal{R}_c(s)}(1 - F(Q(s))) - L\Big)^{-}\chi\{Q(s) = 0\}\Big]\,ds,$$

where

(3.10)
$$\overline{W}_c(t) := \lim_{N\to\infty}\frac{1}{\kappa_c^N N}\sum_{n=1}^{N} \mathcal{W}_n(t)\,\chi\{n \in K_c\} = \langle \mathrm{Id}, M_c(t)\rangle.$$

We call the above a strong solution associated with the derived subsequence. Moreover, (3.10) is true along all sequences. Hence, the solution to (3.9) and (3.8) is a strong solution of the system in Theorem 2.

Extension to the timeout and slow-start phases. If we consider the ex- 
tended system with timeouts and slow-start, we have to define N^(t), the 
proportion of the connections from class c in congestion avoidance at time t. 
There will be similar proportions N^(t) in timeout and N^(t) in slow-start. 
The equation for the queue (neglecting boundary terms) becomes 

d ^P- =jj£N*WlM» (t)> + n» N* (t)(Id,H?(t))} -L, 

c=l 

where M^(t) is the histogram of the window sizes of connections in conges- 
tion avoidance and (t) is the histogram of the window sizes of connections 
in slow start. 

We can force the queue to be deterministic by considering the modified 
system (again neglecting boundary terms): 



d Q CO = Vr„JV vt\rA( + 
dt z - 



K?E(tf?(t))E(Id, M N C (*)> + k"EW* (t))E(Id, U N C (t))] - L. 

c=l 



The window equations for the modified system are uncoupled as before. We 
can again pick subsequences so that Q N (t) converges and then further sub- 
sequences so that E(N^(t)), E(N^(t)) and E(Nf?(t)) converge. As before, 
the limiting system is in fact a strong solution to the extended system. 

4. Uniqueness of strong solutions. We have constructed a strong solution $(W, R, Q, M)$ to (2.13) and (2.12). Our approach is to prove $L^1$ convergence of $(W^N, Q^N)$ to the strong solution on $[0,T]$:

Proposition 1. Under Assumptions 1 and 2, $D_N(T) \to 0$ as $N \to \infty$, where

$$D_N(t) := E \sup_{\tau \le t} |Q^N(\tau) - Q(\tau)| + E\|W^N(t) - W(t)\|,$$

where

$$\|W^N(t) - W(t)\| = \frac{1}{N}\sum_{n=1}^{N} \sup_{\tau \le t} |W_n^N(\tau) - W_n(\tau)|.$$

We conclude that $Q^N$ and, hence, $R_n^N$ converge in probability to $Q$ and $R_n$, respectively. Hence, for each window $W_n^N$ solving (2.12), the intensity $\lambda_n^N(s)$ converges to

$$\frac{W_n(s - R_n(s))}{R_n(s - R_n(s))}\,K(s - R_n(s)).$$

This implies that each window $W_n^N$ converges in probability to $W_n$ in Skorohod norm.

We also have a stronger result than the weak convergence of $W^N$ to $W$:



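COROLLARY 1. If Assumptions 1 and 2 hold, then $\|M_c^N(t) - M_c(t)\|_w \to 0$ in probability for any $t \le T$.

Proof. For any bounded Lipschitz function $g$,

$$\limsup_{N\to\infty} |E\langle g, M_c^N(t)\rangle - E\langle g, M_c(t)\rangle|$$

$$= \limsup_{N\to\infty} \Big|E\frac{1}{\kappa_c^N N}\sum_{n=1}^{N} g(W_n^N(t))\,\chi\{n \in K_c\} - E\langle g, M_c(t)\rangle\Big|$$

$$\le \limsup_{N\to\infty} \Big|\frac{1}{\kappa_c^N N}\sum_{n=1}^{N} E[g(W_n^N(t)) - g(W_n(t))]\,\chi\{n \in K_c\}\Big| + \limsup_{N\to\infty} \Big|\frac{1}{\kappa_c^N N}\sum_{n=1}^{N} E\,g(W_n(t))\,\chi\{n \in K_c\} - E\langle g, M_c(t)\rangle\Big|$$

$$\le \limsup_{N\to\infty} \frac{1}{\kappa_c^N N}\sum_{n=1}^{N} E|g(W_n^N(t)) - g(W_n(t))|$$

$$\le C_g \limsup_{N\to\infty} \frac{1}{N}\sum_{n=1}^{N} E|W_n^N(t) - W_n(t)|,$$

where $C_g$ is the Lipschitz constant divided by $\min\{\kappa_c^N\}$. Hence,

$$\limsup_{N\to\infty} |E\langle g, M_c^N(t)\rangle - E\langle g, M_c(t)\rangle| \le C_g \limsup_{N\to\infty} \frac{1}{N}\sum_{n=1}^{N} E \sup_{\tau \le t} |W_n^N(\tau) - W_n(\tau)|$$

$$\le C_g \limsup_{N\to\infty} E\|W^N(t) - W(t)\| \to 0.$$

Since we can construct a convergence determining sequence based on positive bounded Lipschitz functions, the result follows immediately. □

A similar argument shows that $\|M_c^N(t - R_c^N(t), \cdot\,; t, \cdot) - M_c(t - R_c(t), \cdot\,; t, \cdot)\|_w \to 0$ in probability for any $t \le T$.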

To prove $D_N(t) \to 0$ as $N \to \infty$, we will establish a Gronwall inequality: $D_N(t) \le B_N(t) + C\int_0^t D_N(s)\,ds$, where $B_N(t) \to 0$ as $N \to \infty$. It is easier to establish the Gronwall inequality for Gentle RED where the drop probability function $F_G(q)$ rises from $p_{\max}$ at $q_{\max}$ to 1 at $q_{\max} + \delta$ and, hence, is Lipschitz. For Gentle RED, the results in Section 4.1 apply with $\rho = T$. Moreover, we will see below that $B_N(t) = E\int_0^t |S_N(s) - S(s)|\,ds$, where $S_N$ and $S$ were defined by (2.14) and (2.15). This means we can even get a rate of convergence using the Gronwall inequality since $D_N(t) \le B_N(t) + C\int_0^t B_N(s)\exp(C(t - s))\,ds$.
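
The explicit Gronwall bound can be checked numerically on a toy case. The sketch below is only an illustration: the constant forcing term `B` and the constant `C = 2` are arbitrary choices, and the function names are hypothetical. It compares the bound $B(t) + C\int_0^t B(s)e^{C(t-s)}\,ds$ with a discretized solution of the extremal case $D(t) = B(t) + C\int_0^t D(s)\,ds$.

```python
import math

def gronwall_bound(B, C, t, steps=2000):
    """Evaluate B(t) + C * int_0^t B(s) exp(C (t - s)) ds by the trapezoidal rule."""
    h = t / steps
    vals = [B(i * h) * math.exp(C * (t - i * h)) for i in range(steps + 1)]
    integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return B(t) + C * integral

def extremal_solution(B, C, t, steps=2000):
    """Solve D(t) = B(t) + C * int_0^t D(s) ds with a left-endpoint quadrature."""
    h = t / steps
    running = 0.0            # approximates int_0^{ih} D(s) ds
    D = B(0.0)
    for i in range(1, steps + 1):
        running += h * D
        D = B(i * h) + C * running
    return D
```

For constant `B = b` both quantities equal $b\,e^{Ct}$ up to discretization error, so any $D$ satisfying the inequality is controlled by the size of $B$.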

Unfortunately for RED, when $Q^N$ hits the boundary $q_{\max}$, the dynamics change because $F$ is not Lipschitz at $q_{\max}$. Our solution to proving Proposition 1 is to prove convergence on $[0,\rho]$, where $\rho$ is the (deterministic) time when $Q(t)$ first hits $q_{\max}$. This is done in Section 4.1. In Section 4.2 we show the convergence of the transmission rates of the prelimit extends for a time $T_{\min}$ beyond $\rho$ (and, in fact, $T_{\min}$ beyond any point in time). This holds because the transmission rates are determined one RTT in the past. This allows us to extend our proof to cross the boundary when $Q$ hits $q_{\max}$. Then, in Section 4.3, we prove convergence on the interval $[0,\sigma]$, where $\rho \le \sigma$ and $\sigma$ is the first time $Q(t)$ leaves the boundary; that is, $Q(t) < q_{\max}$ for $t \in (\sigma, \sigma + \delta]$ for some $\delta > 0$.
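
The difference between the two drop functions can be seen in a small numerical sketch. The shapes below are assumptions for illustration only: both functions ramp linearly from 0 to $p_{\max}$ on $[0, q_{\max}]$ (the text does not specify $F$ below $q_{\max}$ here), the RED-style function is taken to equal 1 beyond $q_{\max}$, and the gentle version rises linearly from $p_{\max}$ at $q_{\max}$ to 1 at $q_{\max} + \delta$ as described above.

```python
def red_drop(q, q_max, p_max):
    """Illustrative RED-style drop probability with a jump just above q_max."""
    if q <= q_max:
        return p_max * q / q_max     # assumed linear ramp up to p_max
    return 1.0                       # assumed value beyond q_max

def gentle_red_drop(q, q_max, p_max, delta):
    """Gentle RED: rises from p_max at q_max to 1 at q_max + delta."""
    if q < q_max:
        return p_max * q / q_max     # assumed linear ramp up to p_max
    if q < q_max + delta:
        return p_max + (1.0 - p_max) * (q - q_max) / delta
    return 1.0

def max_slope(f, lo, hi, steps):
    """Largest difference quotient of f over a uniform grid on [lo, hi]."""
    h = (hi - lo) / steps
    return max(abs(f(lo + (i + 1) * h) - f(lo + i * h)) / h for i in range(steps))
```

Refining the grid leaves the largest slope of the gentle function bounded by $\max\{p_{\max}/q_{\max}, (1 - p_{\max})/\delta\}$, while the largest difference quotient of the RED-style function grows without bound, which is the failure of the Lipschitz property referred to above.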

We now prove a couple of lemmas we will need.

Lemma 4. $\dot{Q}(t) \ge -L + \delta$, where $\delta > 0$, and $\dot{Q}(t) \le a(t)/T_{\min} - L$.

Proof. Taking expectations of (2.12) and following the steps of Lemma 2, we get

$$\frac{dEW_n(t)}{dt} \ge \frac{1}{T_{\max} + q_{\max}/L} - \frac{EW_n(t)}{2}\,\frac{a(t)}{T_{\min}}.$$

Since the above bound is positive when $EW_n(t)$ is small, it follows that $EW_n(t)$ is uniformly bounded away from zero for all $t$ and $n$. It therefore follows that $\overline{W}_c(t)$ is uniformly bounded away from zero for all $t$, $n$ and $c$. From (2.13), this means that $\dot{Q}(t) > -L$.

The second inequality follows immediately from (2.13). □
Lemma 5. For $0 \le s \le t \le T$, we have that

$$\sup_{0\le s\le t} |R_n^N(s) - R_n(s)|, \qquad \sup_{0\le s\le t} |R_n^N(s - R_n^N(s)) - R_n(s - R_n(s))|$$

and

$$\sup_{0\le s\le t} |Q^N(s - R_n^N(s)) - Q(s - R_n(s))|$$

are bounded by $C \sup_{0\le s\le t - T_n} |Q^N(s) - Q(s)|$.

Proof. Let $\phi_n(s) = T_n + Q(s)/L$, $g_n(s) = s + \phi_n(s)$, $\phi_n^N(s) = T_n + Q^N(s)/L$ and $g_n^N(s) = s + \phi_n^N(s)$. For any $t$, let $g_n(u) = t$ and let $g_n^N(u^N) = t$. Note that $u \le t - T_n$ and $u^N \le t - T_n$. Hence, $|R_n^N(t) - R_n(t)| = |\phi_n^N(u^N) - \phi_n(u)| = |u^N - u|$. Suppose $u < u^N$ (or $u^N < u$); the function $g_n$ increases (or decreases) from $g_n(u) = t = g_n^N(u^N)$ to $g_n(u^N)$ by an amount of at least $\delta|u^N - u|/L$ since, by Lemma 4, the derivative of $g_n(u) = u + \phi_n(u)$ is bounded below by $\delta/L$. It follows that

$$\frac{\delta}{L}|u^N - u| \le |g_n^N(u^N) - g_n(u^N)| \le \sup_{0\le s\le t - T_n} |g_n^N(s) - g_n(s)| \le \sup_{0\le s\le t - T_n} |Q^N(s) - Q(s)|/L.$$

Hence,

$$|R_n^N(t) - R_n(t)| = |u^N - u| \le \sup_{0\le s\le t - T_n} |Q^N(s) - Q(s)|/\delta$$

and this gives the first result.

In the same manner we obtained (2.5), we have

$$1 - \dot{R}_n(t) = \frac{1}{1 + \dot{Q}(t - R_n(t))/L}$$

and from Lemma 4,

$$|\dot{R}_n(s)| = \Big|\frac{\dot{Q}(s - R_n(s))/L}{1 + \dot{Q}(s - R_n(s))/L}\Big| \le |a(s)/T_{\min} - L|/(\delta/L).$$

Consequently, using the mean value theorem for $s \le t$, we have

$$|R_n(s - R_n^N(s)) - R_n(s - R_n(s))| \le C|R_n^N(s) - R_n(s)| \le C \sup_{0\le u\le t - T_n} |Q^N(u) - Q(u)|.$$

Finally,

$$\sup_{0\le s\le t} |R_n^N(s - R_n^N(s)) - R_n(s - R_n(s))|$$

$$\le \sup_{0\le s\le t} |R_n^N(s - R_n^N(s)) - R_n(s - R_n^N(s))| + \sup_{0\le s\le t} |R_n(s - R_n^N(s)) - R_n(s - R_n(s))|$$

$$\le \sup_{0\le s\le t} |R_n^N(s) - R_n(s)| + \sup_{0\le s\le t} |R_n(s - R_n^N(s)) - R_n(s - R_n(s))|$$

$$\le C \sup_{0\le u\le t - T_n} |Q^N(u) - Q(u)|$$

by the first result and the inequality above.
The third result follows because

$$\sup_{0\le s\le t} |Q^N(s - R_n^N(s)) - Q(s - R_n(s))|$$

$$\le \sup_{0\le s\le t} |Q^N(s - R_n^N(s)) - Q(s - R_n^N(s))| + \sup_{0\le s\le t} |Q(s - R_n^N(s)) - Q(s - R_n(s))|$$

$$\le \sup_{0\le s\le t - T_n} |Q^N(s) - Q(s)| + C \sup_{0\le s\le t} |R_n^N(s) - R_n(s)|$$

since the derivative of $Q$ is bounded and $R_n \ge T_n$,

$$\le C \sup_{0\le u\le t - T_n} |Q^N(u) - Q(u)|$$

by the first result. □

4.1. Convergence away from the boundary. We start with $Q^N(0) = q(0)$ in the interior:

$0 < q(0) < q_{\max}$.

Define $\rho$, respectively $\rho^N$, to be the stopping time when $Q(t)$, respectively $Q^N(t)$, first hits $q_{\max}$. We must define the distance between the marginal process $W^N(t) = (W_1^N(t), \ldots, W_N^N(t))$, $Q^N(t)$ and $M^N(t) = (M_1^N(t), \ldots, M_d^N(t))$ and the limit processes up to the stopping time $\rho \wedge \rho^N$. For any $t \le \rho$, define

$$\|W^N(t) - W(t)\| = \frac{1}{N}\sum_{n=1}^{N} \sup_{0\le\tau\le t\wedge\rho^N} |W_n^N(\tau) - W_n(\tau)|,$$

where $\tau$ is a stopping time with respect to $\mathcal{F}_t$. Define

$$D_N(t) := E \sup_{\tau \le t\wedge\rho^N} |Q^N(\tau) - Q(\tau)| + E\|W^N(t) - W(t)\|.$$

We will establish a Gronwall inequality:
Proposition 2. $D_N(t) \le B_N(t) + C\int_0^t D_N(s)\,ds$ for $t \le \rho$ and $\sup_{t\le\rho} B_N(t) \to 0$ as $N \to \infty$, where $C$ is a canonical constant throughout this calculation (which unfortunately depends on $F'$). Moreover, $\rho^N$ converges to $\rho$ in probability.

We just group all universal constants which do not depend on $N$ into $C$. We note that one of the factors in $C$ is the Lipschitz constant, $F'(q)$, $q < q_{\max}$.

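Estimate for $Q$.

$$E \sup_{0\le\tau\le t\wedge\rho^N} |Q^N(\tau) - Q(\tau)| \le E \sup_{0\le\tau\le t\wedge\rho^N} \int_0^\tau |S_N^N(s)(1 - K^N(s)) - S(s)(1 - K(s))|\,ds$$

(4.1)
$$\le E \sup_{0\le\tau\le t\wedge\rho^N} \int_0^\tau |(S_N^N(s) - S_N(s))(1 - K^N(s))|\,ds + E \sup_{0\le\tau\le t\wedge\rho^N} \int_0^\tau |(S_N(s) - S(s))(1 - K^N(s))|\,ds + E \sup_{0\le\tau\le t\wedge\rho^N} \int_0^\tau |S(s)(K^N(s) - K(s))|\,ds$$

and

$$E \sup_{0\le\tau\le t\wedge\rho^N} |Q^N(\tau) - Q(\tau)| \le E \sup_{0\le\tau\le t\wedge\rho^N} \int_0^\tau \frac{1}{N}\sum_{n=1}^{N} \Big|\frac{W_n^N(s)}{R_n^N(s)} - \frac{W_n(s)}{R_n(s)}\Big|\,ds + E \sup_{0\le\tau\le t\wedge\rho^N} \int_0^\tau |S_N(s) - S(s)|\,ds + E \sup_{0\le\tau\le t\wedge\rho^N} \int_0^\tau S(s)|K^N(s) - K(s)|\,ds$$

$$\le \int_0^t \frac{1}{N}\sum_{n=1}^{N} E\Big(\sup_{0\le\tau\le s\wedge\rho^N} \Big|\frac{W_n^N(\tau)}{R_n^N(\tau)} - \frac{W_n(\tau)}{R_n(\tau)}\Big|\Big)\,ds + B_N + E\Big(\sup_{0\le\tau\le t\wedge\rho^N} \int_0^\tau \frac{1}{T_{\min}}\,a(s)\,|K^N(s) - K(s)|\,ds\Big),$$

where $B_N = E\int_0^t |S_N(s) - S(s)|\,ds$.
Next,

$$E\Big(\sup_{0\le\tau\le s\wedge\rho^N} \Big|\frac{W_n^N(\tau)}{R_n^N(\tau)} - \frac{W_n(\tau)}{R_n(\tau)}\Big|\Big) \le E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |W_n^N(\tau) - W_n(\tau)|\,\frac{1}{R_n^N(\tau)}\Big) + E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |W_n(\tau)|\,\Big|\frac{1}{R_n^N(\tau)} - \frac{1}{R_n(\tau)}\Big|\Big)$$

(4.2)
$$\le \frac{1}{T_{\min}}\,E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |W_n^N(\tau) - W_n(\tau)|\Big) + \frac{a(s)}{T_{\min}^2}\,E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |R_n^N(\tau) - R_n(\tau)|\Big)$$

$$\le \frac{1}{T_{\min}}\,E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |W_n^N(\tau) - W_n(\tau)|\Big) + \frac{1}{T_{\min}^2}\,a(s)\,C\,E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |Q^N(\tau) - Q(\tau)|\Big)$$

using Lemma 5.
Moreover,

$$E\Big(\sup_{0\le\tau\le t\wedge\rho^N} \int_0^\tau \frac{1}{T_{\min}}\,a(s)\,|K^N(s) - K(s)|\,ds\Big) \le C\,E\Big(\sup_{0\le\tau\le t\wedge\rho^N} \int_0^\tau |K^N(s) - K(s)|\,ds\Big)$$

(4.3)
$$\le C\,E\Big(\sup_{0\le\tau\le t\wedge\rho^N} \int_0^\tau |F(Q^N(s)) - F(Q(s))|\,ds\Big) \le C\int_0^t E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |Q^N(\tau) - Q(\tau)|\Big)\,ds.$$

Hence,

$$E\Big(\sup_{0\le\tau\le t\wedge\rho^N} |Q^N(\tau) - Q(\tau)|\Big) \le \int_0^t \frac{1}{T_{\min}}\,\frac{1}{N}\sum_{n=1}^{N} E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |W_n^N(\tau) - W_n(\tau)|\Big)\,ds$$

(4.4)
$$+ \int_0^t \frac{1}{T_{\min}^2}\,a(s)\,C\,E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |Q^N(\tau) - Q(\tau)|\Big)\,ds + B_N + C\int_0^t E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |Q^N(\tau) - Q(\tau)|\Big)\,ds$$

$$\le B_N + C\int_0^t D_N(s)\,ds.$$
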

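Estimate for $W$. From (2.12)

$$\frac{1}{N}\sum_{n=1}^{N} E\Big(\sup_{0\le\tau\le t\wedge\rho^N} |W_n^N(\tau) - W_n(\tau)|\Big)$$

(4.5), (4.6)
$$\le E\Big(\sup_{0\le\tau\le t\wedge\rho^N} \int_0^\tau \frac{1}{N}\sum_{n=1}^{N} \Big|\frac{1}{R_n^N(s)} - \frac{1}{R_n(s)}\Big|\,ds\Big)$$

(4.7)
$$+ \frac{1}{N}\sum_{n=1}^{N} E\Big(\sup_{0\le\tau\le t\wedge\rho^N} \Big|\int_0^\tau \frac{1}{2}\big[W_n^N(s^-)\,dN_n^N(s) - W_n(s^-)\,dN_n(s)\big]\Big|\Big).$$

Again, (4.5) is bounded by

$$\int_0^t \frac{1}{T_{\min}^2}\,\frac{1}{N}\sum_{n=1}^{N} E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |R_n^N(\tau) - R_n(\tau)|\Big)\,ds \le C\int_0^t E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |Q^N(\tau) - Q(\tau)|\Big)\,ds.$$
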

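By the definition of $dN_n^N(s)$,

$$\int_0^\tau W_n^N(s^-)\,dN_n^N(s) = \int_{s=0}^{\tau}\!\int_{u=0}^{\infty} W_n^N(s^-)\,\chi_{[0,\lambda_n^N(s))}(u)\,\Gamma_n(du,ds).$$

Consequently,

$$\frac{1}{N}\sum_{n=1}^{N} E\Big(\sup_{0\le\tau\le t\wedge\rho^N} \Big|\int_0^\tau \big[W_n^N(s^-)\,dN_n^N(s) - W_n(s^-)\,dN_n(s)\big]\Big|\Big)$$

$$\le \frac{1}{N}\sum_{n=1}^{N} E\Big(\sup_{0\le\tau\le t\wedge\rho^N} \int_0^\tau\!\int_{u=0}^{\infty} \big|W_n^N(s^-)\chi_{[0,\lambda_n^N(s))}(u) - W_n(s^-)\chi_{[0,\lambda_n(s))}(u)\big|\,\Gamma_n(du,ds)\Big)$$

$$\le \frac{1}{N}\sum_{n=1}^{N} E\int_0^{t\wedge\rho^N}\!\int_{u=0}^{\infty} \big|W_n^N(s^-)\chi_{[0,\lambda_n^N(s))}(u) - W_n(s^-)\chi_{[0,\lambda_n(s))}(u)\big|\,\Gamma_n(du,ds)$$

$$= \frac{1}{N}\sum_{n=1}^{N} E\int_0^{t\wedge\rho^N}\!\int_{u=0}^{\infty} \big|W_n^N(s^-)\chi_{[0,\lambda_n^N(s))}(u) - W_n(s^-)\chi_{[0,\lambda_n(s))}(u)\big|\,du\,ds.$$
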

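Hence,

$$\frac{1}{N}\sum_{n=1}^{N} E\Big(\sup_{0\le\tau\le t\wedge\rho^N} \Big|\int_0^\tau \big[W_n^N(s^-)\,dN_n^N(s) - W_n(s^-)\,dN_n(s)\big]\Big|\Big)$$

$$\le \frac{1}{N}\sum_{n=1}^{N} E\int_0^{t\wedge\rho^N} |W_n^N(s^-) - W_n(s^-)|\,\big(\lambda_n^N(s) \wedge \lambda_n(s)\big)\,ds + \frac{1}{N}\sum_{n=1}^{N} E\int_0^{t\wedge\rho^N} |W_n^N(s^-) \vee W_n(s^-)| \cdot |\lambda_n^N(s) - \lambda_n(s)|\,ds$$

(4.8)
$$\le \int_0^t \frac{a(s)}{T_{\min}}\,\frac{1}{N}\sum_{n=1}^{N} E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |W_n^N(\tau) - W_n(\tau)|\Big)\,ds$$

(4.9)
$$+ \frac{1}{N}\sum_{n=1}^{N} E\int_0^{t\wedge\rho^N} a(s)\,|\lambda_n^N(s) - \lambda_n(s)|\,ds,$$

where $\lambda_n^N(s)$ and $\lambda_n(s)$ are less than $(w_n + s/T_{\min})/T_{\min} = a(s)/T_{\min}$.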
Also,

$$|\lambda_n^N(s) - \lambda_n(s)| \le |W_n^N(s - R_n^N(s)) - W_n(s - R_n^N(s))|\,\frac{K^N(s - R_n^N(s))}{R_n^N(s - R_n^N(s))}$$

$$+ |W_n(s - R_n^N(s)) - W_n(s - R_n(s))|\,\frac{K^N(s - R_n^N(s))}{R_n^N(s - R_n^N(s))}$$

$$+ |W_n(s - R_n(s))| \cdot \Big|\frac{1}{R_n^N(s - R_n^N(s))} - \frac{1}{R_n(s - R_n(s))}\Big| \cdot |K^N(s - R_n^N(s))|$$

$$+ W_n(s - R_n(s))\,\frac{1}{R_n(s - R_n(s))}\,|K^N(s - R_n^N(s)) - K(s - R_n(s))|.$$

Hence,

(4.10) $|\lambda_n^N(s) - \lambda_n(s)| \le |W_n^N(s - R_n^N(s)) - W_n(s - R_n^N(s))|/T_n$

(4.11) $\quad + |W_n(s - R_n^N(s)) - W_n(s - R_n(s))|/T_n$

(4.12) $\quad + a(s)\,|R_n^N(s - R_n^N(s)) - R_n(s - R_n(s))|/T_n^2$

(4.13) $\quad + a(s)\,|K^N(s - R_n^N(s)) - K(s - R_n(s))|/T_n.$

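We must bound $E\big(\int_0^{t\wedge\rho^N} |\lambda_n^N(s) - \lambda_n(s)|\,ds\big)$, so we must bound the expectation of the integral of each of the above terms. The first (4.10) satisfies

$$E\Big(\int_0^{t\wedge\rho^N} |W_n^N(s - R_n^N(s)) - W_n(s - R_n^N(s))|\,ds\Big)\Big/T_n \le \frac{1}{T_{\min}}\int_0^t E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |W_n^N(\tau) - W_n(\tau)|\Big)\,ds.$$

The second term (4.11) is bounded by

$$E\Big(\int_0^{t\wedge\rho^N} |W_n(s - R_n^N(s)) - W_n(s - R_n(s))|\,ds\Big)\Big/T_n \le E\Big(\int_0^{t\wedge\rho^N}\!\int_{I_n^N(s)} \Big[\frac{1}{R_n(u)}\,du + \frac{1}{2}W_n(u^-)\,dN_n(u)\Big]\,ds\Big)\Big/T_n,$$

where $I_n^N(s) = [(s - R_n^N(s)) \wedge (s - R_n(s)),\ (s - R_n^N(s)) \vee (s - R_n(s))]$.

Note that

$$\int_0^{t\wedge\rho^N} \chi\{(s - R_n^N(s)) \wedge (s - R_n(s)) \le u \le (s - R_n^N(s)) \vee (s - R_n(s))\}\,ds = (u + \phi_n^N(u) \vee \phi_n(u)) \wedge (t \wedge \rho^N) - (u + \phi_n^N(u) \wedge \phi_n(u)) \wedge (t \wedge \rho^N),$$

where $\phi_n(u) = T_n + Q(u)/L$ and $\phi_n^N(u) = T_n + Q^N(u)/L$.
Hence,

$$E\Big(\int_0^{t\wedge\rho^N} |W_n(s - R_n^N(s)) - W_n(s - R_n(s))|\,ds\Big)\Big/T_n$$

$$\le \frac{1}{T_{\min}^2}\,E\Big(\int_0^{t\wedge\rho^N} |R_n(s) - R_n^N(s)|\,ds\Big) + \frac{1}{T_{\min}}\,E\Big(\int_0^{t\wedge\rho^N} \big|\phi_n^N(u) \vee \phi_n(u) - \phi_n^N(u) \wedge \phi_n(u)\big|\,\frac{a(u)}{2}\,\lambda_n(u)\,du\Big)$$

$$\le \frac{C}{T_{\min}^2}\int_0^t E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |Q(\tau) - Q^N(\tau)|\Big)\,ds + \int_0^t \frac{a(s)^2}{2LT_{\min}^2}\,E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |Q^N(\tau) - Q(\tau)|\Big)\,ds$$

$$\le C\int_0^t E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |Q^N(\tau) - Q(\tau)|\Big)\,ds,$$

where $C$ is a constant.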

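To bound the third term (4.12), for $s \le t \wedge \rho^N$,

$$|R_n^N(s - R_n^N(s)) - R_n(s - R_n(s))| \le C \sup_{\tau\le s} |Q^N(\tau) - Q(\tau)|$$

by Lemma 5. Taking expectations gives

$$E\Big(\int_0^{t\wedge\rho^N} |R_n^N(s - R_n^N(s)) - R_n(s - R_n(s))|\,ds\Big) \le C\int_0^t E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |Q^N(\tau) - Q(\tau)|\Big)\,ds.$$

Similarly, to bound the fourth term (4.13), for $s \le t \wedge \rho^N$,

$$|K^N(s - R_n^N(s)) - K(s - R_n(s))|$$

(4.14)
$$\le C\big[|K^N(s - R_n^N(s)) - K(s - R_n^N(s))| + |K(s - R_n^N(s)) - K(s - R_n(s))|\big]$$

$$\le C\big[|Q^N(s - R_n^N(s)) - Q(s - R_n^N(s))| + |Q(s - R_n^N(s)) - Q(s - R_n(s))|\big]$$

$$\le C\Big[\sup_{\tau\le s} |Q^N(\tau) - Q(\tau)| + |R_n^N(s) - R_n(s)|\Big]$$

(4.15)
$$\le C \sup_{\tau\le s} |Q^N(\tau) - Q(\tau)|,$$

since $Q$ has a bounded derivative. Taking expectations shows the fourth term is bounded by

$$C\int_0^t E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |Q^N(\tau) - Q(\tau)|\Big)\,ds.$$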
Hence, we can bound (4.9) by

$$C\int_0^t \Big[E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |Q^N(\tau) - Q(\tau)|\Big) + \frac{1}{N}\sum_{n=1}^{N} E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |W_n^N(\tau) - W_n(\tau)|\Big)\Big]\,ds \le C\int_0^t D_N(s)\,ds.$$

Putting together (4.8), (4.9) and (4.7), we get

$$\frac{1}{N}\sum_{n=1}^{N} E\Big(\sup_{0\le\tau\le t\wedge\rho^N} |W_n^N(\tau) - W_n(\tau)|\Big) \le C\int_0^t \Big[E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |Q^N(\tau) - Q(\tau)|\Big) + \frac{1}{N}\sum_{n=1}^{N} E\Big(\sup_{0\le\tau\le s\wedge\rho^N} |W_n^N(\tau) - W_n(\tau)|\Big)\Big]\,ds.$$

Hence,

$$E\|W^N(t) - W(t)\| \le C\int_0^t D_N(s)\,ds.$$

Finally, add in (4.4) and we get our Gronwall inequality: $D_N(t) \le B_N(t) + C\int_0^t D_N(s)\,ds$.

4.2. When crossing or grazing the boundary. The construction of the
Gronwall inequality in the previous subsection is fairly standard and is
sufficient for proving mean field convergence when Gentle RED is used. This
subsection, however, resolves the fundamental problem when $Q(t)$ just grazes
$q_{\max}$ when RED is used. For example, if we changed the dynamics on the
boundary to cause the queue to rapidly grow when $Q$ hit $q_{\max}$, then $Q^N$
would not converge to $Q$ along many sample paths. Those paths where $Q^N$
just missed $q_{\max}$ would drop, while those that hit $q_{\max}$ would rise.

Fortunately, our system allows us to resolve this problem. Using the delay,
we show below that $\sup_{0 \le \tau \le \rho + T_{\min}} |S_N^N(\tau) - S(\tau)| \to 0$ as $N \to \infty$. In other
words, we can extend the convergence of the transmission rate one RTT into
the future beyond the first time to hit the boundary because the evolution
of the windows for $[\rho, \rho + T_{\min}]$ is already determined at time $\rho$. This forces
$Q^N$ to follow $Q$ for one RTT after hitting the boundary. Hence, if
$S(t)(1 - p_{\max}) > L$ for $t \in (\rho, \rho + \delta)$, where $\delta > 0$, then we are assured the prelimit $Q^N$
hits the boundary close to where $Q$ hits the boundary and that $\rho^N$ converges
to $\rho$. Moreover, we are assured that, for $N$ large enough, there will be a last
exit time $\eta^N$, where $\rho^N \le \eta^N < \rho + \delta$ when $Q^N$ jitters off the boundary
and that $E|\eta^N - \rho| \to 0$. We can therefore define $\sigma^N$ unambiguously as the
infimum of those times after $\rho + \delta$ that $Q^N$ leaves the boundary.

When $Q$ grazes the boundary at time $\rho$, then $S(t)(1 - p_{\max}) < L$ for
$t \in (\rho, \rho + \delta]$ for some $\delta > 0$. Consequently, $\sigma = \rho$. This case poses a
mathematical difficulty because the prelimit $Q^N$ either hits the boundary at a time
close to $\rho$ or else avoids the boundary altogether. It is therefore difficult to
define $\rho^N$ and $\sigma^N$. We resolve this problem at the end of this subsection by
effectively skipping over $\rho$ and saying $Q$ didn't really hit the boundary and
the prelimit will at most spend a vanishingly small time on the boundary.

This extension of the convergence of $S_N^N(t)$ to $S(t)$ for times $t \in [\rho, \rho + T_{\min}]$
is, in fact, valid for any time. The proof doesn't change. In Section 4.3
we use this fact to extend the convergence to $\sigma$ and then by iteration to $T$.
(For future reference, it might even be useful to introduce a delay into a
system without delay to obtain this property and then prove weak convergence
as the delay tends to zero.)

To show convergence one RTT into the future, for $t \le \rho + T_{\min}$, define

$$H(t) = E\Big[\frac{1}{N} \sum_{n=1}^N \sup_{0 \le \tau \le t} |W_n^N(\tau) - W_n(\tau)|\Big].$$



For t < T m i n , we can use the same steps as the estimates (4.5) and (4.6) to 
get 



1 N 

H(t)<E sup \W*{T)-W r 

N n=1 0<r<p 



{r)\ 



1 / fP+T mi 



(4.16) 



(4.17) 



n=l ^ J P 

N 

2N 



1 



1 



ds 



n=l V< T <* 



Again, (4.16) is bounded by 
a(T) 



-E 



W^(s-)dN^(s)-W n (s-)dN n (s)} 
sup \R%(t) - Rn{r) 



0< T < p +T min 



<CE[ sup \Q N (t)-Q(t) 

\0<T<p 

by Lemma 5. We have already shown convergence up until $\rho$, so this tends
to zero as $N \to \infty$.

We estimate (4.17) as we did (4.6) to obtain the following terms
corresponding to (4.8) and (4.9):



1 



-Ye 



(4.18) 



(4.19) 



n=l 



< 



sup 

p<r<t 



ais 



[W^(s-)dN»(s)-W n (s-)dN n (s)] 



N 



p ^min 



1 N 

-Y 



n=l 



n=l J P 



sup \W^(r)-W n (r) 

P<T<S 



E(a(s)\\%(s)-\ n (s)\ds). 



ds 



We can bound (4.19) using the decomposition given by (4.10), (4.11),
(4.12) and (4.13). Since each of these terms involves times more than $T_{\min}$ in
the past, it is not hard to see that the integral of term (4.10) is bounded by
$C E\big[\frac{1}{N} \sum_{n=1}^N \sup_{0 \le \tau \le \rho} |W_n^N(\tau) - W_n(\tau)|\big]$ and that term (4.11) is bounded by
$C E\big(\sup_{0 \le \tau \le \rho} |Q^N(\tau) - Q(\tau)|\big)$, as is the integral of (4.12) and (4.13). All of
these bounds tend to zero as $N \to \infty$.

We conclude that $H(t) \le B^N(t) + C \int_0^t H(s)\,ds$ for $0 \le t \le \rho + T_{\min}$, where
$B^N(t)$ again denotes a term which goes to zero as $N \to \infty$. Using the Gronwall
inequality, we conclude $H(t) \to 0$ as $N \to \infty$. Hence we have convergence
of the windows over $[\rho, \rho + T_{\min}]$. Refining the estimate (4.2), we get



E 



sup 

0<T<p+Tn 



W n {T) 



<(r) 



< 



1 



T 

-L IT 



-E 



sup 

0<r<p+T min 



Mr) 
|W^(r)-W^(r)| 



+ -^-a{s)CE( sup \Q»( t )-Q{t)\ 

i min \0<r<p 

using Lemma 5. Both these estimates tend to zero as $N \to \infty$, so we have
shown $\sup_{0 \le \tau \le \rho + T_{\min}} |S_N^N(\tau) - S(\tau)| \to 0$ as $N \to \infty$. This completes the
argument.

Once we have convergence of the transmission rate until $\rho + T_{\min}$, the case
where $Q$ enters the boundary becomes obvious and, clearly, $\rho^N$ converges to
$\rho$.

The case where $Q$ grazes the boundary so $\rho = \sigma$ is also resolved because,
in probability,

$$S_N^N(t)(1 - p_{\max}) \to S(t)(1 - p_{\max}) < L \quad \text{for } \rho < t \le \rho + T_{\min}.$$

Hence, if $Q^N$ enters the boundary at time $\rho^N$, it leaves almost immediately
so $\sigma^N - \rho^N \to 0$ in probability. Consequently, we can continue the iteration
described in Section 4.3 until the next time, $\rho_1$, $Q$ really hits the boundary
and the contribution to the terms

$$E\Big(\int |K^N(s) - K(s)|\,ds\Big)$$

and

$$E\Big(\int |K^N(s - R_n^N(s)) - K(s - R_n(s))|\,ds\Big)$$

by the integral over the interval $[\rho^N, \eta^N]$ is negligible.

The case where $Q$ stays on the boundary and $S(t)(1 - p_{\max}) = L$ for
$t \in [\rho, \sigma]$ is also theoretically possible. It poses no problem, however, because
$Q$ can be considered to be on or off the boundary so the estimates in the
previous subsections apply.

4.3. Mean-field convergence on the boundary. We now prove convergence
on the interval $t \in [0, \sigma]$. Assume $\sigma > \rho$. In Section 4.2 we showed how to
handle the case when the queue grazes the boundary. Define $\sigma^N$ to be the
infimum over times greater than $\rho + \delta$ that $Q^N$ is less than $q_{\max}$, where
$\delta < \sigma - \rho$.

For any $0 \le t \le \sigma$, redefine

$$\|W^N(t) - W(t)\| = \frac{1}{N} \sum_{n=1}^N \sup_{0 \le \tau \le t \wedge \sigma^N} |W_n^N(\tau) - W_n(\tau)|,$$

where $\tau$ is a stopping time with respect to $\mathcal{F}_t$ and $\sigma^N$ is the end of the first
sojourn on the boundary by $Q^N$. Redefine

$$D^N(t) := E \sup_{\tau \le t \wedge \sigma^N} |Q^N(\tau) - Q(\tau)| + E\|W^N(t) - W(t)\|.$$

Again, we will establish a Gronwall inequality: $D^N(t) \le B_1^N(t) + C \int_0^t D^N(s)\,ds$
for $t \in [0, \sigma]$ where $\sup_{t \le \sigma} B_1^N(t) \to 0$ as $N \to \infty$.

The calculation is almost the same as in Section 4.1. We need only improve
the bounds on the terms (4.3) and (4.9) via (4.14). For $s \le t \wedge \sigma^N$, where
$t \le \sigma$, there are three possibilities: both $Q^N(s)$ and $Q(s)$ are away from the
boundary, or both are on the boundary, or one is on the boundary and the
other isn't. If $Q^N(s) < q_{\max}$ and $Q(s) < q_{\max}$, then

$$|K^N(s) - K(s)| = |F(Q^N(s)) - F(Q(s))| \le C \sup_{\tau \le s \wedge \sigma^N} |Q^N(\tau) - Q(\tau)|.$$

If $Q^N(s) = Q(s) = q_{\max}$, $K^N(s)$ and $K(s)$ are given by

$$S_N^N(s)(1 - K^N(s)) = L \quad\text{and}\quad S(s)(1 - K(s)) = L,$$

with $K^N(s), K(s) \ge p_{\max}$. Note this means

$$S_N^N(s) \ge L/(1 - p_{\max}) \quad\text{and}\quad S(s) \ge L/(1 - p_{\max}). \qquad \text{(4.20)}$$

Hence, if $Q^N(s) = Q(s) = q_{\max}$,

$$\begin{aligned}
|K^N(s) - K(s)| &= |L/S_N^N(s) - L/S(s)| \\
&\le \frac{(1 - p_{\max})^2}{L}\, |S_N^N(s) - S(s)| \\
&\le C\big(|S_N^N(s) - S_N(s)| + |S_N(s) - S(s)|\big) \\
&\le C D^N(s) + C |S_N(s) - S(s)|,
\end{aligned}$$

using the same estimate as (4.1).
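The step from the difference of reciprocals to the difference of transmission rates rests on an elementary bound, spelled out here with $a$ and $b$ standing for the two rates. The constant $(1 - p_{\max})^2 / L$ is worked out from the lower bounds in (4.20) rather than read off the text, so it should be taken as a reconstruction.

```latex
% Assumption: a, b >= L/(1 - p_max) > 0, as in (4.20).
\[
\Big|\frac{L}{a} - \frac{L}{b}\Big|
= \frac{L\,|a - b|}{a\,b}
\le L\,|a - b|\,\Big(\frac{1 - p_{\max}}{L}\Big)^{2}
= \frac{(1 - p_{\max})^{2}}{L}\,|a - b| .
\]
```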

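Hence, to bound expression (4.3), for $t \le \sigma$,

$$\begin{aligned}
E\Big(\sup_{0 \le \tau \le t \wedge \sigma^N} \int_0^\tau |K^N(s) - K(s)|\,ds\Big)
&\le C \int_0^t D^N(s)\,ds + C \int_0^t |S_N(s) - S(s)|\,ds \\
&\quad + E\Big(\sup_{0 \le \tau \le t \wedge \sigma^N} \int_0^\tau \chi\{Q^N(s) < q_{\max} = Q(s)\}\,ds\Big) \qquad \text{(4.21)} \\
&\quad + E\Big(\sup_{0 \le \tau \le t \wedge \sigma^N} \int_0^\tau \chi\{Q(s) < q_{\max} = Q^N(s)\}\,ds\Big) \\
&\le C \int_0^t D^N(s)\,ds + B^N(t) + E|\rho^N - \rho| + E|\rho^N - \eta^N|.
\end{aligned}$$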

To bound (4.9), we need to improve our bound on (4.14). As before,

$$\begin{aligned}
|K^N(s - R_n^N(s)) - K(s - R_n(s))| &\le |K^N(s - R_n^N(s)) - K(s - R_n^N(s))| \\
&\quad + |K(s - R_n^N(s)) - K(s - R_n(s))|. \qquad \text{(4.22)}
\end{aligned}$$

If $Q(s - R_n^N(s)) < q_{\max}$ and $Q(s - R_n(s)) < q_{\max}$, then since $F(q)$ is
Lipschitz for $q < q_{\max}$,

$$\begin{aligned}
|K(s - R_n^N(s)) - K(s - R_n(s))| &= |F(Q(s - R_n^N(s))) - F(Q(s - R_n(s)))| \\
&\le C |Q(s - R_n^N(s)) - Q(s - R_n(s))| \\
&\le C |R_n^N(s) - R_n(s)| \quad \text{because $Q$ is differentiable} \\
&\le C \sup_{\tau \le s \wedge \sigma^N} |Q^N(\tau) - Q(\tau)|.
\end{aligned}$$

However, if $Q(s - R_n^N(s)) = Q(s - R_n(s)) = q_{\max}$,

$$\begin{aligned}
|K(s - R_n^N(s)) - K(s - R_n(s))| &= |L/S(s - R_n^N(s)) - L/S(s - R_n(s))| \\
&\le \frac{(1 - p_{\max})^2}{L}\, |S(s - R_n^N(s)) - S(s - R_n(s))| \\
&\le C |R_n^N(s) - R_n(s)| \\
&\le C \sup_{\tau \le s \wedge \sigma^N} |Q^N(\tau) - Q(\tau)|.
\end{aligned}$$

We used the fact that $S(s) = \sum_{c=1}^{d} \kappa_c r_c(s)$ is Lipschitz, as was checked in
Section 3.2.

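We can therefore bound the integral of the second expression in (4.22):

$$\begin{aligned}
E\Big(\int_0^{t \wedge \sigma^N} |K(s - R_n^N(s)) - K(s - R_n(s))|\,ds\Big)
&\le C \int_0^t D^N(s)\,ds \\
&\quad + E\Big(\sup_{0 \le \tau \le t \wedge \sigma^N} \int_0^\tau \chi\{Q^N(s - R_n^N(s)) < q_{\max} = Q(s - R_n(s))\}\,ds\Big) \\
&\quad + E\Big(\sup_{0 \le \tau \le t \wedge \sigma^N} \int_0^\tau \chi\{Q(s - R_n(s)) < q_{\max} = Q^N(s - R_n^N(s))\}\,ds\Big).
\end{aligned}$$

However, $Q(s - R_n(s)) < q_{\max} = Q^N(s - R_n^N(s))$ implies $s - R_n(s) < \rho$
and $s - R_n^N(s) > \rho^N$; that is, when $s \in (\rho^N + \phi_n^N(\rho^N), \rho + \phi_n(\rho))$. Hence,

$$\begin{aligned}
\int_0^\tau \chi\{Q(s - R_n(s)) < q_{\max} = Q^N(s - R_n^N(s))\}\,ds
&\le |\rho^N - \rho| + |\phi_n^N(\rho^N) - \phi_n(\rho)| \\
&\le |\rho^N - \rho| + \frac{1}{L} |Q^N(\rho^N) - Q(\rho)| \\
&\le |\rho^N - \rho| + \frac{1}{L} |Q(\rho^N) - Q(\rho)| + C \sup_{\tau \le s \wedge \sigma^N} |Q^N(\tau) - Q(\tau)| \\
&\le C |\rho^N - \rho| + C \sup_{\tau \le s \wedge \sigma^N} |Q^N(\tau) - Q(\tau)|.
\end{aligned}$$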

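Moreover, $Q^N(s - R_n^N(s)) < q_{\max} = Q(s - R_n(s))$ can occur when
$s - R_n(s) < \rho$ and $s - R_n^N(s) > \rho^N$ or $\rho^N < s - R_n^N(s) < \eta^N$; that is, when
$s \in (\rho^N + \phi_n^N(\rho^N), \rho + \phi_n(\rho))$ or when $s \in (\rho^N + \phi_n^N(\rho^N), \eta^N + \phi_n^N(\eta^N))$. Hence,

$$\begin{aligned}
\int_0^\tau \chi\{Q^N(s - R_n^N(s)) < q_{\max} = Q(s - R_n(s))\}\,ds
&\le |\rho^N - \rho| + |\rho^N - \eta^N| + |\phi_n^N(\rho^N) - \phi_n(\rho)| + |\phi_n^N(\eta^N) - \phi_n^N(\rho^N)| \\
&\le 2|\rho^N - \rho| + 2|\rho^N - \eta^N| + \frac{1}{L} |Q^N(\rho^N) - Q(\rho)| + \frac{1}{L} |Q^N(\eta^N) - Q^N(\rho^N)| \\
&\le 2|\rho^N - \rho| + 2|\rho - \eta^N| + \frac{1}{L} |Q(\eta^N) - Q(\rho)| + \frac{1}{L} |Q(\rho^N) - Q(\rho)| + C \sup_{\tau \le s \wedge \sigma^N} |Q^N(\tau) - Q(\tau)| \\
&\le C(|\rho^N - \rho| + |\rho - \eta^N|) + C \sup_{\tau \le s \wedge \sigma^N} |Q^N(\tau) - Q(\tau)|.
\end{aligned}$$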

Adding these terms together, we get

$$E \int_0^{t \wedge \sigma^N} |K(s - R_n^N(s)) - K(s - R_n(s))|\,ds \le C\big(E|\rho^N - \rho| + E|\rho - \eta^N|\big) + C \int_0^t D^N(s)\,ds.$$

We can easily bound the integral of the first expression in (4.22) using
the same estimates we made for (4.21):

$$E\Big(\sup_{0 \le \tau \le t \wedge \sigma^N} \int_0^\tau |K^N(s - R_n^N(s)) - K(s - R_n^N(s))|\,ds\Big) \le C \int_0^t D^N(s)\,ds + B^N(t) + C\big(E|\rho^N - \rho| + E|\eta^N - \rho|\big).$$

Adding these extra pieces together, we see $D^N(t) \le B_1^N(t) + C \int_0^t D^N(s)\,ds$,
where $B_1^N(t) = 2B^N(t) + C(E|\rho^N - \rho| + E|\eta^N - \rho|)$. The first iteration
established that $\rho^N$ and $\eta^N$ converge in probability to $\rho$, so $B_1^N(t) \to 0$ as
$N \to \infty$. Consequently, $\sup_{t \le \sigma} D^N(t) \to 0$ and we now have convergence on
$[0, \sigma]$.
Using the results in Section 4.2, we can show $\sup_{0 \le \tau \le \sigma + T_{\min}} |S_N^N(\tau) - S(\tau)| \to 0$
as $N \to \infty$. Hence, if $S(t)(1 - p_{\max}) < L$ for $t \in (\sigma, \sigma + \delta)$ where $\delta > 0$,
then we are assured the prelimit $Q^N$ leaves the boundary close to where $Q$
leaves the boundary and that $\sigma^N$ converges to $\sigma$. We can now iterate to show
convergence up through any sequence of entrances and departures from the
boundary. It is conceivable that there may be a limit point where a sequence
of entrance and departure times converge to some time $\theta$. We can use our
theory to prove mean field convergence as close as we want to $\theta$. Then, again
using the delay, we can show $\sup_{0 \le \tau \le \theta + T_{\min}} |S_N^N(\tau) - S(\tau)| \to 0$ as $N \to \infty$.
We can therefore establish convergence beyond $\theta$. There can never be a last
time beyond which we cannot establish the mean field convergence.

5. RED is a weak limit of Gentle RED. Define the drop probability
function for Gentle RED by $F_\delta(q)$, which rises from $p_{\max}$ at $q_{\max}$ to 1 at
$q_{\max} + \delta$. Section 4.1 gives mean field convergence for Gentle RED over any
interval $[0, T]$ since $F_\delta$ is Lipschitz. Here we show that, as $\delta \to 0$, Gentle RED
becomes RED and $F_\delta$ tends (weakly) to $F$, the drop probability function for
RED.
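The two drop functions can be compared numerically with a minimal sketch. The upper ramp from $p_{\max}$ at $q_{\max}$ to 1 at $q_{\max} + \delta$ is the one described above; the lower ramp from 0 at a threshold `q_min` is the classic RED shape and is an assumption here, as are the function names and the numerical values. The sketch also returns 1 at $q = q_{\max}$ for RED, whereas in the model the drop probability on the boundary is determined by $S(1 - K) = L$.

```python
def red_drop(q, q_min, q_max, p_max):
    """RED drop probability F(q): linear ramp from 0 at q_min to p_max at q_max
    (assumed classic shape), then 1 at or above q_max (simplification on the boundary)."""
    if q < q_min:
        return 0.0
    if q < q_max:
        return p_max * (q - q_min) / (q_max - q_min)
    return 1.0


def gentle_red_drop(q, q_min, q_max, p_max, delta):
    """Gentle RED drop probability F_delta(q): identical to RED below q_max,
    then a second linear ramp from p_max at q_max to 1 at q_max + delta."""
    if q < q_max:
        return red_drop(q, q_min, q_max, p_max)
    if q < q_max + delta:
        return p_max + (1.0 - p_max) * (q - q_max) / delta
    return 1.0


if __name__ == "__main__":
    # Hypothetical thresholds; only p_max = 0.05 and q_max = 1000 echo the paper's simulations.
    for delta in (10.0, 1.0, 0.1):
        print(delta, gentle_red_drop(1000.5, 200.0, 1000.0, 0.05, delta))
```

For any fixed $q > q_{\max}$ the Gentle RED value equals 1 as soon as $\delta \le q - q_{\max}$, and below $q_{\max}$ the two functions coincide for every $\delta$, which is the pointwise picture behind the weak limit.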

For any $\delta$ and any $N$, redefine the solution to the $N$-particle system in
Section 2, $(W^N(t), Q^N(t))$ by $(W^{\delta,N}, Q^{\delta,N})$, so now $(W^N(t), Q^N(t))$ only
denotes the solution with loss function $F$. These processes are constructed
iteratively on the almost surely finite number of segments defined by jumps
of $T_n(s, A(T))$; $n = 1, \ldots, N$, where $T_n$, $n = 1, 2, \ldots$ are defined on the
probability space $(\Omega, \mathcal{F}, P)$. Let $R^{\delta,N} = (R_1^{\delta,N}, \ldots, R_N^{\delta,N})$ be the corresponding
round trip time delay of the connections. Let

$$S_N^{\delta,N}(t) = \frac{1}{N} \sum_{c=1}^N \frac{W_c^{\delta,N}(t)}{R_c^{\delta,N}(t)}.$$

Let $P^{\delta,N}$ be the measure induced on $D[0,T] \times C[0,T]$ by $(S_N^{\delta,N}, Q^{\delta,N})$
(where coordinates greater than $N$ are identically zero). In the same manner
as Lemma 1, we can show the measures $P^{\delta,N}$ are tight.

Using this lemma we can now prove the following:

Lemma 6. $(W^{\delta,N}, Q^{\delta,N}, R^{\delta,N})$ converge weakly to $(W^N, Q^N, R^N)$ as
$\delta \to 0$. The lemma holds even for $N = \infty$.

PROOF. By hypothesis, $Q^{\delta,N}(t) = Q^N(t) = q(0)$ and $W_n^{\delta,N}(t) = W_n^N(t) = w_n$
for $t \le 0$. We show the drop probability $K^{\delta,N}(t) = F_\delta(Q^{\delta,N}(t))$ converges
as $\delta \to 0$.

First, pick a subsequence $\delta_k$ such that $P^{\delta_k,N}$ converges weakly to $P^{0,N}$
and, moreover, such that $(W^{\delta_k,N}, S_N^{\delta_k,N}, Q^{\delta_k,N})$ converges almost surely to
$(W^{0,N}, S_N^{0,N}, Q^{0,N})$.

If $Q^{\delta_k,N}(t) < q_{\max}$, then $K^{\delta_k,N}(t) = F_{\delta_k}(Q^{\delta_k,N}(t)) = F(Q^{\delta_k,N}(t))$. On the
other hand, if $Q^{\delta_k,N}(t) \ge q_{\max}$, then there will exist a time $\rho^{\delta_k,N}(t) \le t$
when $Q^{\delta_k,N}$ last hit $q_{\max}$. Consider the solution to (2.9) over the interval
$[\rho^{\delta_k,N}(t), t]$ and let $V^{\delta_k,N}(s) = 1 - F_{\delta_k}(Q^{\delta_k,N}(s))$. Note that, for
$\rho^{\delta_k,N}(t) \le s \le t$,

$$\frac{dV^{\delta_k,N}(s)}{ds} = -c_{\delta_k} S_N^{\delta_k,N}(s) V^{\delta_k,N}(s) + L c_{\delta_k},$$

where $c_{\delta_k} = (1 - p_{\max})/\delta_k$. Solve this equation from time $\rho^{\delta_k,N}(t)$, where
$V^{\delta_k,N}(\rho^{\delta_k,N}(t)) = 1 - p_{\max}$, up to time $t$:

$$\begin{aligned}
V^{\delta_k,N}(t) &= (1 - p_{\max}) \exp\Big(-\int_{\rho^{\delta_k,N}(t)}^{t} c_{\delta_k} S_N^{\delta_k,N}(u)\,du\Big)
 + \int_{\rho^{\delta_k,N}(t)}^{t} L c_{\delta_k} \exp\Big(-\int_u^t c_{\delta_k} S_N^{\delta_k,N}(s)\,ds\Big)\,du \\
&= (1 - p_{\max}) \exp\big(-v(\rho^{\delta_k,N}(t))\big) + \int_0^{v(\rho^{\delta_k,N}(t))} e^{-v} \frac{L}{S_N^{\delta_k,N}(u(v))}\,dv,
\end{aligned}$$

where $v = v(u) = \int_u^t c_{\delta_k} S_N^{\delta_k,N}(s)\,ds$ and $u(v)$ is the inverse defined implicitly.
Define $D^{\delta_k,N}(t) = 0$ if $Q^{\delta_k,N}(t) < q_{\max}$ and for $Q^{\delta_k,N}(t) \ge q_{\max}$, define

D S k ,N^ = y6 k ,N^__ ' 



(l-Pm^)eM-v(p Sk ' N (t))) 



e 



-v(p d k,"(t))_ 



s s N k ' N (t) 



Since $(S_N^{\delta_k,N}(t), Q^{\delta_k,N}(t))$ converges almost surely to $(S_N^{0,N}(t), Q^{0,N}(t))$ as
$\delta_k \to 0$, it follows that $\rho^{\delta_k,N}(t)$ converges to a limit $\rho^{0,N}(t)$. Since $S_N^{\delta_k,N}(t) > 0$
almost surely, $v(\rho^{\delta_k,N}(t)) \to \infty$ as $\delta_k \to 0$ and, for a fixed $v$,
$v / c_{\delta_k} = \int_{u(v)}^{t} S_N^{\delta_k,N}(s)\,ds$, so $u(v) \to t$ as $\delta_k \to 0$. We conclude $D^{\delta_k,N}(t) \to 0$ almost
surely.
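The mechanism can be illustrated in the simplest case, a minimal sketch assuming a constant transmission rate $S \ge L/(1 - p_{\max})$ and time measured from the last hit of $q_{\max}$. Under that assumption the linear equation for $V$ has the closed form coded below, and the relaxation rate $c_\delta S$ blows up as $\delta \to 0$, so $V(t)$ snaps to $L/S$ for every fixed $t > 0$. The function name and the numerical values are illustrative, not taken from the paper.

```python
import math


def v_closed_form(t, delta, S, L, p_max):
    """Solution of dV/dt = -c*S*V + L*c with V(0) = 1 - p_max and c = (1 - p_max)/delta,
    for a constant rate S (assumption) and t measured from the last hit of q_max."""
    c = (1.0 - p_max) / delta
    decay = math.exp(-c * S * t)
    # Transient from the initial value plus relaxation towards the fixed point L/S.
    return (1.0 - p_max) * decay + (L / S) * (1.0 - decay)


if __name__ == "__main__":
    L, S, p_max, t = 10433.0, 12000.0, 0.05, 0.01   # hypothetical values with S >= L/(1 - p_max)
    for delta in (100.0, 10.0, 1.0):
        gap = abs(v_closed_form(t, delta, S, L, p_max) - L / S)
        print(f"delta={delta:6.1f}  |V(t) - L/S| = {gap:.3e}")
```

In this constant-rate case the limiting accepted fraction is $L/S$, that is, a drop probability of $1 - L/S$, which is the boundary relation $S(1 - K) = L$.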

If we now take the limit as $\delta_k \to 0$ in (2.9) satisfied by $(S_N^{\delta_k,N}(t), Q^{\delta_k,N}(t))$,
using loss function $F_{\delta_k}$, we see $(S_N^{0,N}(t), Q^{0,N}(t))$ satisfies (2.9) with loss
function $F$, so $K^{0,N}(t) = 1 - L/S_N^{0,N}(t)$ if $Q^{0,N}(t) = q_{\max}$. Moreover, the
window $W_n^{\delta_k,N}$ satisfies (2.3), where the rate of window reductions is

$$\lambda_n^{\delta_k,N}(t) := \frac{W_n^{\delta_k,N}(t - R_n^{\delta_k,N}(t))}{R_n^{\delta_k,N}(t - R_n^{\delta_k,N}(t))}\, F_{\delta_k}\big(Q^{\delta_k,N}(t - R_n^{\delta_k,N}(t))\big).$$

Taking the limit as $\delta_k \to 0$, we see $W_n^{\delta_k,N}$ converges to $W_n^{0,N}$, satisfying (2.3)
where the rate of window reductions is

$$\lambda_n^{0,N}(t) = \frac{W_n^{0,N}(t - R_n^{0,N}(t))}{R_n^{0,N}(t - R_n^{0,N}(t))}\, K^{0,N}\big(t - R_n^{0,N}(t)\big).$$

All this means $W^{0,N} = (W_1^{0,N}, \ldots, W_N^{0,N})$ and $Q^{0,N}$ are a solution to (2.3)
and (2.9), so, in fact, must be $W^N = (W_1^N, \ldots, W_N^N)$ and $Q^N$. Hence, the
limit along subsequences is unique so we have proved weak convergence. □



6. Mean-field stochastic differential equations. In this section we prove 
Theorem 1. We can reformulate (2.8) as in [2]. For g €Q, 

(g,M c N (t))-(g,M c N (0)) 



1 



N 

E 



c n=l 



N 



■ ds 



+ {g(W^(s-)/2)-g{W^(s-)))dN n { S ) 



X {n £ K c } 



+ (g(W»(s)/2)-g(W»(s))) 



W«(8-R»{8)) 

R?(s-R»(s)) 



K N (s-R^(s))ds 



where £^(t) is given by 
N 



-i^ £ x {n G K c } J (g(W?(s-)/2) - g(W n ( S -))) dZ^(s) 



and 



Z^(t)-Z«(0): 



c n=l 



N, 



y n[S) R?(s-R?(s)) 



K N (s-R?(s))ds 



Hence, 

(g,M c N (t))-(g,M c N (0)) 

1 / d g( w ) 



R?(s)\ dw 



,M?(s)^ds 
(6.2) 



+ ((g(w/2) - g(w))v, Aff (s - R?(s),dv; s, dw)) 
1 



K N (s-R?(s))ds 



N i 



R?(s-R?(s)y 

+ £ c N (t). 

We first show $\varepsilon_c^N$ is asymptotically small as $N \to \infty$. Recall



W = ^E| C» n (a)Z>! n (d8), 



where 



O) = X{n e X-c}(ff («)/2) - g« v (a))) 



N, 



rN, 



and 



TV 



If $n \in K_c$, $N_n^N(s)$ is a point process adapted to $\mathcal{F}_n(t)$ with a stochastic
intensity $W_n^N(s - R_n^N(s)) K^N(s - R_n^N(s)) / R_n^N(s - R_n^N(s))$. Consequently,
$Z_n^N(t)$ is a martingale. Recall that $W_n^N(s)$ is also adapted to $\mathcal{F}_n(t)$, so the
left continuous version $W_n^N(s-)$ is $\mathcal{F}_n(t)$-predictable. By Theorem T13 in [1],

E{e c N (t)f 

i 



- N 

x E 



|£»(. 6 'rj J ['(cgJ , W ^:ff ^-«i'W)^ 



< 



jV 



< 



x^ExK^}((^( S )/2)-^( S )f |^ 

l K /V iv J \n=l Jo / 



r/.s 



where $C_1$ is a constant depending on $\sup g$ and $\sup g'$.
We have the a priori bound $W_n^N(t) \le a(t)$. Hence, $\varepsilon_c^N(t)$ tends to 0 in $L^2$.
Since, in addition, $\varepsilon_c^N(t)$ is a martingale, it follows that

$$P\Big(\sup_{t \in [0,T]} |\varepsilon_c^N(t)| \ge \lambda\Big) \le E(\varepsilon_c^N(T))^2 / \lambda^2,$$

so the process $\varepsilon_c^N(t)$, $t \in [0,T]$, converges to zero in probability.

The processes $Q^N(t)$ and $K^N(t)$ converge in probability to the limit
processes $Q(t)$ and $K(t)$, while $(M_1^N(t), \ldots, M_d^N(t))$ converges to $(M_1(t), \ldots, M_d(t))$
in probability where the limit processes satisfy (2.13) and (2.12). Take the
limit of (6.2) and we have our proof.

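7. Numerical analysis. Assuming $g(0) = 0$ and that $g(w) \to 0$ as $w \to \infty$,
we can rewrite (1.2) as

$$\begin{aligned}
\langle g, M_c(t) \rangle - \langle g, M_c(0) \rangle
= \int_0^t \Big[ &-\frac{1}{R_c(s)} \langle g(w), D_w M_c(s, dw) \rangle \\
&+ \frac{K(s - R_c(s))}{R_c(s - R_c(s))} \times \big\langle g(w),\ e(s, s - R_c(s), 2w) \cdot M_c(s, 2dw) \\
&\qquad\qquad - e(s, s - R_c(s), w) \cdot M_c(s, dw) \big\rangle \Big]\,ds,
\end{aligned}$$

where $D_w M_c(s, dw)$, respectively $D_t M_c(s, dw)$, is the Fréchet derivative of
the measure $M_c(s, dw)$ with respect to $w$, respectively $t$. Consequently,

$$\begin{aligned}
D_t M_c(t, dw) = &-\frac{1}{R_c(t)} D_w M_c(t, dw) \\
&+ \frac{K(t - R_c(t))}{R_c(t - R_c(t))} \times \big( e(t, t - R_c(t), 2w) M_c(t, d(2w)) \\
&\qquad\qquad - e(t, t - R_c(t), w) M_c(t, dw) \big). \qquad \text{(7.1)}
\end{aligned}$$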

Neither $M_c(t, dw)$ nor $M_c(s - R_c(s), dv; s, dw)$ is a state, but the above
equation does provide enough information to evolve the system. Let $\mu_c(t)$
denote the process $\{M_c(s, dw);\ t - 1 \le s \le t\}$ (all RTTs are less than 1).
Using (1.1), we can evolve $M_c(t, dw)$ from $t$ to $t + \delta t$ while $M_c(t - s + \delta t, dw)$
is obtained by a time shift. Unfortunately, $\mu_c$ is not a practical state. Even
if we discretize and only keep the trajectory of the process on a partition
giving $\{M_c(s_i, dw);\ t - 1 = s_0 < s_1 < \cdots < s_n = t\}$, it still requires too much
computer memory to solve numerically.

We can avoid this problem by defining a sequence of times $t_k^c$ for each class
such that $t_{k+n+1}^c - R_c(t_{k+n+1}^c) = t_k^c$. If we pick $n$ sufficiently large, this gives
a fine partition. Define $\sharp_c(t)$ = the first $k$ such that $t_k^c > t$. We will construct
our solution by recurrence from time $t_i$ to $t_{i+1}$ by defining $t_{i+1} = \min_c (t^c_{\sharp_c(t_i)})$
starting from time $t_0 = 0$. Next assume that, for each class, we have been able
to calculate and save the vector $V^{*}(t)$, a discretized version of $M_c(t_k^c)$ for
$k = m - n, \ldots, m$, where $m = \sharp_c(t) - 1$ (these are marginals, not the entire
joint distribution). Also assume we save the vector of kernels $V^{F}(t)$ given
by $T_c(t_k^c)$ for $k = m - n, \ldots, m$, where $M_c(t_k^c) = M_c(t_{k-1}^c) \circ T_c(t_k^c)$ and $m$ is
as above. Finally, assume that we save the kernels $S_c(m) = \prod_{k=m-n}^{m} T_c(t_k^c)$.
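The time grid for one class can be sketched as follows, assuming the round trip time is available as a function `rtt(t)` bounded by `max_rtt` (the text takes all RTTs below 1), that $t - \mathrm{rtt}(t)$ is continuous and increasing, and that the first $n + 1$ seed times lie within one RTT of each other. The function names, the bisection solver and the seed choice are illustrative assumptions; the paper does not specify how the times $t_k^c$ are computed.

```python
def next_grid_time(t_prev, rtt, lo, hi, tol=1e-12):
    """Solve t - rtt(t) = t_prev for t in [lo, hi] by bisection.
    Assumes t - rtt(t) is continuous and increasing on [lo, hi]."""
    def g(t):
        return t - rtt(t) - t_prev

    if g(lo) > 0.0 or g(hi) < 0.0:
        raise ValueError("root of t - rtt(t) = t_prev is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def build_grid(seed_times, rtt, horizon, max_rtt=1.0):
    """Extend seed times t_0 < ... < t_n so that t_{k+n+1} - rtt(t_{k+n+1}) = t_k,
    stopping once the next time would exceed the horizon."""
    times = list(seed_times)
    k = 0
    while True:
        # The new time lies after the latest grid time and at most max_rtt after t_k.
        t_next = next_grid_time(times[k], rtt, times[-1], times[k] + max_rtt)
        if t_next > horizon:
            break
        times.append(t_next)
        k += 1
    return times


if __name__ == "__main__":
    grid = build_grid([0.0, 0.02, 0.04, 0.06, 0.08], lambda t: 0.1, horizon=0.35)
    print([round(t, 6) for t in grid])
```

With a constant RTT the grid is simply uniform; with a queue-dependent RTT the spacing stretches and shrinks so that each grid point looks back exactly one RTT onto an earlier grid point, which is what lets the scheme store only $n + 1$ marginals per class.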

We can now evolve our system to $t_{i+1}$. At each step, we evolve the queue
and the one class $c$, where $t_{i+1} = t^c_{\sharp_c(t_i)}$. The inverse kernel $(S_c(m))^{-1}$
gives the conditional distribution of the windows of class $c$ one RTT before
time $t_i$, given the window at time $t_i$. Calculate the conditional expectation
$e(t_i, t_i - R_c(t_i), w)$. With this we can use (1.2) to calculate $T_c(t_{m+1}^c)$.
Drop $T_c(t_{m-n}^c)$. Update $S_c(m + 1) = (T_c(t_{m-n}^c))^{-1} S_c(m) T_c(t_{m+1}^c)$. Finally,
we calculate $Q(t_{i+1})$ using the $M_c(\sharp_c(t_i) - 1)$ for $c = 1, \ldots, d$.

In Section 3.3.2 in [2] we made a smooth approximation to $e(t, t - R_c(t), w)$
based on the fact that one RTT in the past the window size was most likely
the current window size minus one, or twice the current window size if a
loss was detected in the interim. With this approximation, we used (7.1) to
evolve a discrete approximation of the measure $M_c$ (except [2] only treats one
class). The numerical results are excellent after one corrects for the fact that
a proportion of the connections in an Opnet simulation are in timeout (our
model assumes connections instantaneously resume congestion avoidance if
they fall into timeout).

To illustrate the mean field limit, we performed an Opnet simulation
with $N = 200$, $N = 400$ and $N = 800$ sources (see Figures 1-4). Each source
sends packets of size 536 bytes to a T3 router with a transmission rate
of 44.736 Megabits per second or $L = 10433$ packets per second. We assume
the sources all have a transmission delay of 100 milliseconds. The router
implements RED with $p_{\max} = 0.05$ for all the simulations, but we rescale
$q_{\max}$ to be 1000 with 200 sources, 2000 with 400 sources and 4000 with 800
sources. Since $q_{\max}$ scales with $N$, the average queue size does as well while
holding the loss probability fixed, which in turn holds the average window
size fixed. As $N$ increases we see the fluctuations in the relative queue size
(in packets per connection) decrease. We also see that the relative queue size
of the Matlab numerical simulation is a bit high. This is because of timeouts
as discussed in [2].

Fig. 3. Relative queue size with 800 sources.

[Figure legend: "N = infinity" Matlab.]
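The quoted service rate and the scaling of the RED threshold can be checked with a few lines of arithmetic. The sketch below only uses the figures stated in the simulation description; the variable and function names are illustrative.

```python
LINK_RATE_BPS = 44.736e6   # T3 transmission rate quoted above, in bits per second
PACKET_BYTES = 536         # packet size quoted above


def packets_per_second(link_rate_bps, packet_bytes):
    """Service rate in packets per second for fixed-size packets."""
    return link_rate_bps / (8 * packet_bytes)


L = packets_per_second(LINK_RATE_BPS, PACKET_BYTES)

# Number of sources -> q_max used in each simulation run.
scenarios = {200: 1000, 400: 2000, 800: 4000}
q_max_per_connection = {n: q_max / n for n, q_max in scenarios.items()}

if __name__ == "__main__":
    print(f"L = {L:.1f} packets per second")
    print(q_max_per_connection)
```

The rate comes out at roughly 10432.8 packets per second, which rounds to the stated $L = 10433$, and the threshold per connection is the same in all three runs, which is why the relative queue sizes are directly comparable.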

Acknowledgments. J. Reynier thanks François Baccelli for the supervision
of this work. R. McDonald thanks Tom Kurtz for his tutorial on how to
bring back the particles into the proof of weak convergence to the mean-field
model. He also thanks Ruth Williams for her suggestions [14]. Both of us
thank Pierre Brémaud for stopping us from going completely off the rails and
finally we thank a careful referee for his many comments and suggestions.

REFERENCES 

[1] Brémaud, P. (1981). Point Processes and Queues. Springer, Berlin. MR0636252

[2] Baccelli, F., McDonald, D. R. and Reynier, J. (2002). A mean-field model for multiple TCP connections through a buffer implementing RED. Performance Evaluation 11 77-97.

[3] Bain, A. (2003). Fluid limits for congestion control in networks. Ph.D. thesis, Univ. Cambridge.

[4] Billingsley, P. (1995). Convergence of Probability Measures, 3rd ed. Wiley, New York. MR1700749

[5] Dawson, D. A. (1983). Critical dynamics and fluctuations for a mean-field model of cooperative behavior. J. Statist. Phys. 31 29-85. MR0711469

[6] Dawson, D. A. (1993). Measure valued Markov processes. École d'Été de Probabilités de Saint-Flour XXI. Lecture Notes in Math. 1541 1-260. Springer, Berlin. MR1242575

[7] Deb, S. and Srikant, R. (2004). Rate-based versus queue-based models of congestion control. In Proceedings of the Joint International Conference on Measurement and Modeling of Computer Systems 246-257.

[8] Deng, X., Yi, S., Kesidis, G. and Das, C. (2003). A control theoretic approach in designing adaptive AQM schemes. Globecom 2003 2947-2951.

[9] Donnelly, P. and Kurtz, T. (1999). Particle representations for measure-valued population models. Ann. Probab. 27 166-205. MR1681126

[10] Ethier, S. N. and Kurtz, T. G. (1986). Markov Processes: Characterization and Convergence. Wiley, New York. MR0838085

[11] Floyd, S. (1999). The NewReno modification to TCP's fast recovery algorithm. RFC 2582.

[12] Floyd, S. and Jacobson, V. (1993). Random early detection gateways for congestion avoidance. IEEE/ACM Trans. Networking 11 397-413.

[13] Floyd, S., Mahdavi, J., Mathis, M. and Podolsky, M. (2000). An extension to the selective acknowledgement (SACK) Option for TCP. RFC 2883, Proposed Standard, July 2000.

[14] Gromoll, H. C., Puha, A. L. and Williams, R. J. (2002). The fluid limit of a heavily loaded processor sharing queue. Ann. Appl. Probab. 12 797-859. MR1925442

[15] Hollot, C. V., Misra, V., Towsley, D. and Gong, W.-B. (2001). A control theoretic analysis of RED. In Proceedings of IEEE INFOCOM 2001 1510-1519.

[16] Kuusela, P., Lassila, J., Virtamo, J. and Key, P. (2001). Modeling RED with idealized TCP sources. In Proceedings of IFIP ATM and IP 2001 155-166.

[17] Raina, G. and Wischik, D. (2005). Buffer sizes for large multiplexers: TCP queueing theory and stability analysis. In Proceedings of NGI2005 79-82.

[18] Rosolen, V., Bonaventure, O. and Leduc, G. (1999). A RED discard strategy for ATM networks and its performance evaluation with TCP/IP traffic. Computer Communication Review 29 23-43.

[19] Shakkottai, S. and Srikant, R. (2003). How good are deterministic fluid models of internet congestion control? INFOCOM 2002 497-505.

Department of Mathematics
University of Ottawa
Canada
E-MAIL: dmdsg@mathstat.uottawa.ca

Département d'Informatique
École Normale Supérieure
France
E-MAIL: Julien.Reynier@ens.fr



